Purpose: Invalidity Analysis


Patent: US7693710B2
Filed: 2002-05-31
Issued: 2010-04-06
Patent Holder: (Original Assignee) VoiceAge Corp     (Current Assignee) VoiceAge EVS LLC
Inventor(s): Milan Jelinek, Philippe Gournay

Title: Method and device for efficient frame erasure concealment in linear predictive based speech codecs

Abstract: The present invention relates to a method and device for improving concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder ( 106 ) to a decoder ( 110 ), and for accelerating recovery of the decoder after non erased frames of the encoded sound signal have been received. For that purpose, concealment/recovery parameters are determined in the encoder or decoder. When determined in the encoder ( 106 ), the concealment/recovery parameters are transmitted to the decoder ( 110 ). In the decoder, erasure frame concealment and decoder recovery is conducted in response to the concealment/recovery parameters. The concealment/recovery parameters may be selected from the group consisting of: a signal classification parameter, an energy information parameter and a phase information parameter. The determination of the concealment/recovery parameters comprises classifying the successive frames of the encoded sound signal as unvoiced, unvoiced transition, voiced transition, voiced, or onset, and this classification is determined on the basis of at least a part of the following parameters: a normalized correlation parameter, a spectral tilt parameter, a signal-to-noise ratio parameter, a pitch stability parameter, a relative frame energy parameter, and a zero crossing parameter.
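The classification step is the one genuinely algorithmic element summarized in the abstract, so a brief illustration may help. The Python sketch below assigns a frame one of the five classes (unvoiced, unvoiced transition, voiced transition, voiced, onset) from the listed features; the merit function, its weights, and its thresholds are illustrative assumptions and are not taken from the claims or specification of US7693710B2.

```python
# Illustrative sketch only: a toy frame classifier in the spirit of the
# classification named in the abstract. Feature names, weights, and
# thresholds are assumptions for illustration, not values from the patent.

from dataclasses import dataclass
from enum import Enum, auto


class FrameClass(Enum):
    UNVOICED = auto()
    UNVOICED_TRANSITION = auto()
    VOICED_TRANSITION = auto()
    VOICED = auto()
    ONSET = auto()


@dataclass
class FrameFeatures:
    normalized_correlation: float  # pitch-synchronous correlation, roughly 0..1
    spectral_tilt: float           # e.g. first reflection coefficient, -1..1
    snr: float                     # segmental signal-to-noise ratio, in dB
    pitch_stability: float         # larger value = less stable pitch track
    relative_energy: float         # frame energy minus long-term average, in dB
    zero_crossing_rate: float      # zero crossings per sample, roughly 0..1


def classify_frame(feat: FrameFeatures, prev: FrameClass) -> FrameClass:
    """Combine the features into a single merit value, then pick a class."""
    merit = (2.0 * feat.normalized_correlation
             + 0.5 * feat.spectral_tilt
             + 0.05 * feat.snr
             - 0.5 * feat.pitch_stability
             + 0.05 * feat.relative_energy
             - 1.0 * feat.zero_crossing_rate)
    voiced_like = (FrameClass.VOICED, FrameClass.ONSET, FrameClass.VOICED_TRANSITION)
    if merit < 0.3:
        # Weak periodicity: plain unvoiced, or a transition out of voiced speech.
        return FrameClass.UNVOICED_TRANSITION if prev in voiced_like else FrameClass.UNVOICED
    if prev in (FrameClass.UNVOICED, FrameClass.UNVOICED_TRANSITION):
        # Strong periodicity immediately after unvoiced frames: a voiced onset.
        return FrameClass.ONSET
    return FrameClass.VOICED if merit > 0.8 else FrameClass.VOICED_TRANSITION
```

In this toy version, a strongly periodic frame arriving right after unvoiced frames is labeled an onset, mirroring the class list given in the abstract.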




Disclaimer: Apex Standards Pseudo Claim Charting (PCC) is not intended to replace expert opinion; it provides due diligence and transparency prior to high-precision charting. PCC aggressively maps a target patent's claim elements, under Broadest Reasonable, Ordinary, or Customary Interpretation and with multilingual translation, against other documents (candidate technical standard specifications or prior art in the same or different jurisdictions). This enables a top-down, a priori evaluation with which stakeholders can quickly assess standard essentiality (potential strengths) or invalidity (potential weaknesses) before making complex, high-value decisions. PCC is designed to relieve the initial burden of proof through an exhaustive listing of contextual semantic mappings that can serve as building blocks toward a litigation-ready work product. Stakeholders may then refine the shortlisted mappings or identify other relevant materials to formulate strategy.
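As a rough illustration of the claim-element-to-passage pairing described above, the Python sketch below scores each claim term against candidate reference passages. It is a minimal stand-in that assumes nothing about Apex Standards' actual models: a plain token-overlap (Jaccard) score replaces whatever multilingual semantic similarity PCC applies, and the function names (map_claim_elements, _tokens) are hypothetical.

```python
# Minimal stand-in for the contextual semantic mapping the disclaimer
# describes: pair each claim element with reference passages that score
# above a threshold. A real system would use multilingual semantic
# similarity; plain token overlap (Jaccard) is used here only to make
# the top-down pairing concrete. Function names are hypothetical.

import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a phrase or passage."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def map_claim_elements(claim_elements: list[str],
                       reference_passages: list[str],
                       threshold: float = 0.2) -> dict[str, list[str]]:
    """For each claim element, list passages whose overlap score passes the threshold."""
    mapping: dict[str, list[str]] = {}
    for element in claim_elements:
        element_tokens = _tokens(element)
        hits = []
        for passage in reference_passages:
            passage_tokens = _tokens(passage)
            union = element_tokens | passage_tokens
            score = len(element_tokens & passage_tokens) / len(union) if union else 0.0
            if score >= threshold:
                hits.append(passage)
        mapping[element] = hits
    return mapping


# Example with claim terms taken from the chart below; the passages are
# invented placeholders, not quotes from any cited reference.
print(map_claim_elements(
    ["frame erasure concealment", "decoder determines concealment"],
    ["error concealment for packet loss in a speech decoder",
     "adaptive codebook search for low bit-rate CELP coding"]))
```

With the example inputs, "decoder determines concealment" is paired with the passage that mentions a decoder and concealment, while "frame erasure concealment" stays unmatched at that threshold; the Semantic Mapping column below pairs claim terms with reference language in the same spirit.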



Reference categories: Non-Patent Literature, WIPO Prior Art, EP Prior Art, US Prior Art, CN Prior Art, JP Prior Art, KR Prior Art

Ground | References | Owner of the Reference | Title | Semantic Mapping | Challenged Claims (1–25)
1

1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V. : 1591-1594 1997

(Supplee, 1997)
United States Department of Defense, Meade, MD, USAMELP: The New Federal Standard At 2400 Bps decoder determines concealment Linear Prediction
frame erasure PC mode
XXXXXXXXXXXXXXXXXXXXXXXXX
2

2002 IEEE SPEECH CODING WORKSHOP PROCEEDINGS. : 144-146 2002

(Salami, 2002)
VoiceAge CorporationThe Adaptive Multi-rate Wideband Codec: History And Performance onset frame first time
first non time t
XXXXXXXXXXXXXX
3

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. 3 (1): 59-71 JAN 1995

(Chen, 1995)
AT&T Bell LabsADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH communication link automatic gain control
first impulse, impulse responses frequency response
onset frame first time
XXXXXXXXXX
4

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. 3 (1): 94-98 JAN 1995

(Kuo, 1995)
National Tsing Hua University (NTHU) Hsinchu, TaiwanSPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING sound signal, speech signal adaptive codebook
encoding parameters coding method
XXXXXXXXXXXXXXXXXXXXXXX
5

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. 3 (4): 242-250 JUL 1995

(Mccree, 1995)
Georgia Institute of TechnologyA MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING recovery parameters linear predictive coding
LP filter background noise
XXXXXX
6

IEEE NETWORK. 12 (5): 40-48 SEP-OCT 1998

(Perkins, 1998)
University College London (UCL)A Survey Of Packet Loss Recovery Techniques For Streaming Audio concealing frame erasure forward error correction
decoder concealment, frame erasure concealment error concealment, packet loss
XXXXXXXXXXXXXXXXXXXXX
7

1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V. : 1331-1334 1997

(Scheirer, 1997)
Interval Research Corporation, Palo Alto, CA, USAConstruction And Evaluation Of A Robust Multifeature Speech/music Discriminator speech signal, decoder determines concealment speech signal
LP filter excitation signal when i
XXXXXXXXXXXXX
8

SPEECH COMMUNICATION. 12 (2): 193-204 JUN 1993

(Cuperman, 1993)
University of California, Santa Barbara, Simon Fraser UniversityLOW-DELAY SPEECH CODING decoder concealment, decoder recovery transmission error
sound signal, speech signal adaptive codebook
encoding parameters adaptive tree
E q pitch p
XXXXXXXXXXXXXXXXXXXXXXXXX
9

2002 IEEE SPEECH CODING WORKSHOP PROCEEDINGS. : 62-64 2002

(Morinaga, 2002)
The Nippon Telegraph and Telephone Corporation (日本電信電話株式会社, Nippon Denshin Denwa Kabushiki-gaisha, NTT)The Forward-backward Recovery Sub-codec (FB-RSC) Method: A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks LP filter excitation signal synthesized signal
frame concealment previous frames
current frame, decoder determines concealment current frame, packet loss
XXXXXX
10

2002 IEEE SPEECH CODING WORKSHOP PROCEEDINGS. : 47-49 2002

(Feldbauer, 2002)
Technische Universität Graz (TU Graz), AustriaSpeech Coding Using Motion Picture Compression Techniques speech signal, decoder determines concealment speech signal
average energy audio data
XXXXXXX
11

2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI. : 1375-1378 2000

(Wang, 2000)
Signalcom IncA 1200 Bps Speech Coder Based On MELP average energy improving performance
decoder determines concealment speech coder
XXX
12

2000 IEEE WORKSHOP ON SPEECH CODING, PROCEEDINGS. : 145-147 2000

(Erdmann, 2000)
University or Rheinisch-Westfälische Technische Hochschule Aachen (RWTH Aachen)An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization LP filter enhanced performance, background noise
decoder determines concealment, speech signal Linear Prediction, speech signal
pass filter lower band
XXXXXXXXXXXXX
13

2000 IEEE WORKSHOP ON SPEECH CODING, PROCEEDINGS. : 126-128 2000

(Wang, 2000)
Southern Methodist University (SMU), Dallas, TX, USAPerformance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks frame erasure frame erasure
encoding parameters coding method
decoder concealment, decoder determines concealment packet loss
LP filter when frame
XXXXXXXXXXXXXXXXXXXXXXXXX
14

ICASSP 99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI. : 5-8 1999

(Combescure, 1999)
Rheinisch-Westfälische Technische Hochschule Aachen (RWTH Aachen University)A 16, 24, 32 Kbit/s Wideband Speech Codec Based On ATCELP frame erasure concealment frame erasure concealment
decoder determines concealment Linear Prediction
frame concealment CELP codec
XXXXXXXXXXXXXXXX
15

ICASSP 99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI. : 197-200 1999

(Mustapha, 1999)
COMSAT Lab., Clarksburg, MD, USAAn Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter pass filter, LP filter flat frequency response, pass filter
decoder determines concealment speech coder
average pitch, E q high p
XXXXXXXX
16

1997 IEEE WORKSHOP ON SPEECH CODING FOR TELECOMMUNICATIONS, PROCEEDINGS. : 75-76 1997

(Swaminathan, 1997)
Hughes Network SystemsA Robust Low Rate Voice Codec For Wireless Communications signal energy Wireless Communication
decoder concealment, decoder recovery transmission error
LP filter background noise
average energy fixed codebook
XXXXXXXXXXXXXXXXXXX
17

US6260009B1

(Andrew P. DeJaco, 2001)
(Original Assignee) Qualcomm Inc     

(Current Assignee)
Qualcomm Inc
CELP-based to CELP-based vocoder packet translation first impulse said model
E q pitch p
XXXXXX
18

US6233550B1

(Allen Gersho, 2001)
(Original Assignee) University of California     

(Current Assignee)
University of California
Method and apparatus for hybrid coding of speech at 4kbps onset frame successive frames
last frame, replacement frame speech encoder
encoding parameters coding method
XXXXXXXX
19

US5864798A

(Kimio Miseki, 1999)
(Original Assignee) Toshiba Corp     

(Current Assignee)
Toshiba Corp
Method and apparatus for adjusting a spectrum shape of a speech signal LP filter, LP filter excitation signal autocorrelation coefficients, second filters
decoder concealment, decoder determines concealment speech signal output
XXXXXX
20

US5651092A

(Jun Ishii, 1997)
(Original Assignee) Mitsubishi Electric Corp     

(Current Assignee)
Mitsubishi Electric Corp
Method and apparatus for speech encoding, speech decoding, and speech post processing first impulse predetermined characteristic
decoder determines concealment encoded speech signal
decoder recovery, decoder constructs decoding apparatus
encoding parameters coding method
XXXXXXXXXXXXXXXX
21

EP0747883A2

(Peter Kroon, 1996)
(Original Assignee) AT&T Corp; AT&T IPM Corp     

(Current Assignee)
AT&T Corp
Voiced/unvoiced classification of speech for use in speech decoding during frame erasures first impulse said second portion
determining concealment predictive filter
average energy fixed codebook
current frame, decoder determines concealment current frame
LP filter comprises i
XXXXXXXXXXX
22

US5701392A

(Jean-Pierre Adoul, 1997)
(Original Assignee) Universite de Sherbrooke     

(Current Assignee)
Universite de Sherbrooke
Depth-first algebraic-codebook search for fast coding of speech decoder determines concealment encoded speech signal
last non predetermined number, last non
pass filter different position
signal classification parameter selecting step
encoding parameters coding method
recovery parameters said path
E q order r
XXXXXXXXXXXXXXXXXXXXX
23

US5754976A

(Jean-Pierre Adoul, 1998)
(Original Assignee) Universite de Sherbrooke     

(Current Assignee)
Universite de Sherbrooke
Algebraic codebook with signal-selected pulse amplitude/position combinations for fast coding of speech decoder determines concealment encoded speech signal
average pitch value following inequality
pass filter different position
signal classification parameter selecting step
sound signal sound signal
XXXXXXXXXXXXXXXXXXXXXXX
24

US5664055A

(Peter Kroon, 1997)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
BlackBerry Ltd
CS-ACELP speech compression system with adaptive pitch prediction filter gain based on a measure of periodicity signal classification parameter linear prediction filter
speech signal, decoder determines concealment second output signal, speech signal
last frame, replacement frame speech encoder
average energy fixed codebook
average pitch value lower limit
XXXXXXXXXXXXXXXXXXXXXXX
25

US5699485A

(Yair Shoham, 1997)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
BlackBerry Ltd
Pitch delay modification during frame erasures sound signal, speech signal adaptive codebook, speech signal
LP filter comprises i
XXXXXXXXXXXXXXXXXXXXXXXXX
26

US5732389A

(Peter Kroon, 1998)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Voiced/unvoiced classification of speech for excitation codebook selection in celp speech decoding during frame erasures first impulse said second portion
determining concealment predictive filter
average energy fixed codebook
current frame, decoder determines concealment current frame
LP filter comprises i
XXXXXXXXXXX
27

US5699482A

(Jean-Pierre Adoul, 1997)
(Original Assignee) Universite de Sherbrooke     

(Current Assignee)
Universite de Sherbrooke
Fast sparse-algebraic-codebook search for efficient speech coding pass filter different position
decoder determines concealment Linear Prediction
impulse responses, impulse response impulse response
recovery parameters function F
XXXXXXXX
28

US5701390A

(Daniel W. Griffin, 1997)
(Original Assignee) Digital Voice Systems Inc     

(Current Assignee)
Digital Voice Systems Inc
Synthesis of MBE-based coded speech using regenerated phase information decoder determines concealment random noise
LP filter comprises i
XXXXXX
29

JPH08110799A

(Jiyoutarou Ikedo, 1996)
(Original Assignee) Nippon Telegr & Teleph Corp <Ntt>; 日本電信電話株式会社     ベクトル量子化方法及びその復号化器 sound signal, speech signal 音声信号
first impulse 入力音
comfort noise の雑音
XXXXXXXXXXXXXXXXXXXXXXX
30

US5444816A

(Jean-Pierre Adoul, 1995)
(Original Assignee) Universite de Sherbrooke     

(Current Assignee)
Universite de Sherbrooke
Dynamic codebook for efficient speech coding based on algebraic codes speech signal frequency characteristics
recovery parameters linear predictive coding
pass filter different position
signal classification parameter selecting step
XXXXXXXXXXXXXXXXXXXXX
31

US5122875A

(Dipankar Raychaudhuri, 1992)
(Original Assignee) General Electric Co     

(Current Assignee)
General Electric Co
An HDTV compression system concealing frame erasure forward error correction
onset frame second data stream
decoder constructs video signal data
signal classification parameter, speech signal respective frame
current frame, communication link current frame, discrete cosine
maximum amplitude control signal
encoding parameters coding method
average energy audio data
energy information parameter image area
XXXXXXXXXXXXXXXXXXXXXXXXX
32

US4707857A

(John Marley, 1987)
(Original Assignee) John Marley; Kurt Marley     Voice command recognition system having compact significant feature data last non predetermined number
recovery parameters repeating step
current frame analog signal
maximum amplitude said system
XXXXXXXXXX
33

JP2002100994A

(Pasi Ojala, 2002)
(Original Assignee) Nokia Mobile Phones Ltd; ノキア モービル フォーンズ リミティド     媒体ストリームのスケーラブル符号化方法、スケーラブルエンコーダおよびマルチメディア端末 conducting frame erasure concealment の少なくとも1
current frame, replacement frame 備えるマルチ, ワーク
maximum amplitude 有すること
decoder concealment コアデータ
determining concealment 決定手段
sound signal 102
XXXXXXXXXXXXXXXXXXXXXXXXX
34

CN1344067A

(F·伍帕曼, 2002)
(Original Assignee) 皇家菲利浦电子有限公司     采用不同编码原理的传送系统 first impulse 至少其中一个
LP filter excitation signal 从第一
XXXXXXXX
35

US20010023396A1

(Allen Gersho, 2001)
(Original Assignee) Allen Gersho; Eyal Shlomot; Vladimir Cuperman; Chunyan Li     Method and apparatus for hybrid coding of speech at 4kbps onset frame successive frames
signal classification parameter preceding frame
last frame, replacement frame speech encoder
encoding parameters coding method
XXXXXXXXXXXXXXXXXXX
36

JP2002118517A

(Kenichi Makino, 2002)
(Original Assignee) Sony Corp; ソニー株式会社     直交変換装置及び方法、逆直交変換装置及び方法、変換符号化装置及び方法、並びに復号装置及び方法 onset frame フレーム間
sound signal, speech signal 音声信号
last frame 変換係数
determining concealment 決定手段
average pitch 上記逆
XXXXXXXXXXXXXXXXXXXXXXX
37

EP1096477A2

(Akira Inoue, 2001)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Apparatus for converting reproducing speed and method of converting reproducing speed last non predetermined number
recovery parameters delay means
pass filter pass filter
XXXXXX
38

EP1199812A1

(Stefan Bruhn, 2002)
(Original Assignee) Telefonaktiebolaget LM Ericsson AB     

(Current Assignee)
Telefonaktiebolaget LM Ericsson AB
Perceptually improved encoding of acoustic signals sound signal second buffer memory
energy information parameter, phase information parameter coded representation
first non overlapping region
LP filter excitation signal represents a
XXXXXXXXXXXXXXXXXXXXXXXXX
39

EP1087379A2

(Soichi Toyama, Pioneer Corporation, 2001)
(Original Assignee) Pioneer Corp     

(Current Assignee)
Pioneer Corp
Quantization errors correction method in a audio decoder comfort noise quantization errors
pitch period coding device
encoding parameters coding method
average pitch value, concealing frame erasure code values
XXXXXXXXXXXXX
40

EP1132892A1

(Kazutoshi Yasunaga, 2001)
(Original Assignee) Panasonic Corp     

(Current Assignee)
Panasonic Corp
Voice encoder and voice encoding method sound signal, speech signal adaptive codebook
controlling energy medium storing
decoder determines concealment speech coder
XXXXXXXXXXXXXXXXXXXXXXX
41

EP1074976A2

(Tadashi Araki, 2001)
(Original Assignee) Ricoh Co Ltd     

(Current Assignee)
Ricoh Co Ltd
Block switching based subband audio coder current frame comparison means
first non time t
XXXXXXXXXXXX
42

EP1047047A2

(Kazuaki Chikira, Nippon Telegraph and Telephone Corp., 2000)
(Original Assignee) Nippon Telegraph and Telephone Corp     

(Current Assignee)
Nippon Telegraph and Telephone Corp
Audio signal coding and decoding methods and apparatus and recording media with programs therefor decoder recovery, decoder constructs decoding apparatus
last subframe fixed number
first non time t
XXXXXXXXXXXXXXXXXXXX
43

US6377915B1

(Seishi Sasaki, 2002)
(Original Assignee) YRP Advanced Mobile Communication Systems Res Labs Co Ltd     

(Current Assignee)
YRP ADVANCED MOBILE COMMUNICATION SYSTEMS RESEARCH LABORATORIES Co Ltd ; YRP Advanced Mobile Communication Systems Res Labs Co Ltd
Speech decoding using mix ratio table placing remaining impulse responses predetermined frequency
signal energy, LP filter lowest frequency
last frame, replacement frame speech encoder
speech signal, decoder determines concealment speech signal
encoding parameters coding method
XXXXXXXXXXXXXXXXX
44

JP2001166800A

(Yuusuke Hiwazaki, 2001)
(Original Assignee) Nippon Telegr & Teleph Corp <Ntt>; 日本電信電話株式会社     音声符号化方法及び音声復号化方法 decoder determines concealment 音声符号化方法
frame erasure, concealing frame erasure フレームごと
sound signal, speech signal 音声信号, 音声復号
first impulse の平均
pitch period 周期性
XXXXXXXXXXXXXXXXXXXXXXXXX
45

US6393390B1

(Jayesh S. Patel, 2002)
(Original Assignee) DSP Software Engineering Inc     

(Current Assignee)
Telecom Holding Parent LLC
LPAS speech coder using vector quantized, multi-codebook, multi-tap pitch predictor and optimized ternary source excitation codebook derivation decoder determines concealment speech coder
communication link loop manner
XXXXXXXXX
46

JP2001144733A

(Masayuki Kitagawa, 2001)
(Original Assignee) Nec Corp; Nec Viewtechnology Ltd; エヌイーシービューテクノロジー株式会社; 日本電気株式会社     音声伝送装置及び音声伝送方法 frame erasure concealment, frame concealment フレーム番号, エラー
frame erasure, concealing frame erasure フレームごと
maximum amplitude 有すること
speech signal 音声情報, 前記送信
onset frame PSK
XXXXXXXXXXXXXXXXXXXXXXXXX
47

JP2001117573A

(Kimio Miseki, 2001)
(Original Assignee) Toshiba Corp; 株式会社東芝     音声スペクトル強調方法/装置及び音声復号化装置 maximum amplitude 有すること
sound signal, speech signal 音声信号, 音声復号
XXXXXXXXXXXXXXXXXXXXXXX
48

JP2001051698A

(Seiji Sasaki, 2001)
(Original Assignee) Yrp Kokino Idotai Tsushin Kenkyusho:Kk; 株式会社ワイ・アール・ピー高機能移動体通信研究所     音声符号化復号方法および装置 speech signal, sound signal 音声復号方法, 音声信号
decoder concealment, decoder recovery 符号化器
first non 拡散処理
pass filter, LP filter ローパス, バンド
impulse responses 生成器, 音発生
XXXXXXXXXXXXXXXXXXXXXXX
49

JP2001013998A

(Hiroyuki Ebara, 2001)
(Original Assignee) Matsushita Electric Ind Co Ltd; Nec Corp; 日本電気株式会社; 松下電器産業株式会社     音声復号化装置及び符号誤り補償方法 speech signal 音声復号, する記録
determining concealment 決定手段
XXXXXXX
50

WO9966494A1

(Grant Ian Ho, 1999)
(Original Assignee) Comsat Corporation     Improved lost frame recovery techniques for parametric, lpc-based speech coding systems onset frame successive frames
average energy fixed codebook
signal energy signal energy
frame concealment, decoder concealment second frames, lost frame
LP filter comprises i
XXXXXXXXXXXXXXXX
51

CN1274456A

(S·P·维勒特, 2000)
(Original Assignee) 萨里大学     语音编码器 first non, first impulse response 的第二个
conducting frame erasure concealment, frame erasure concealment 事先确定, 谱幅度
LP filter excitation signal 从第一
XXXXXXXXXXXXXXXXXXXXXXX
52

US6351730B2

(Juin-Hwey Chen, 2002)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Low-complexity, low-delay, scalable and embedded speech and audio coding with adaptive frame loss concealment encoding parameters contains information, coding method
LP filter excitation signal delay processing, formula i
controlling energy desired bit rate
pitch period successive time
signal classification parameter preceding frame, sample values
decoder determines concealment, current frame Discrete Cosine, current frame
LP filter comprises i
comfort noise main band
replacement frame time lag
average energy point D
XXXXXXXXXXXXXXXXXXXXXXX
53

JP2000267700A

(Seiji Sasaki, 2000)
(Original Assignee) Yrp Kokino Idotai Tsushin Kenkyusho:Kk; 株式会社ワイ・アール・ピー高機能移動体通信研究所     音声符号化復号方法および装置 decoder determines concealment, decoder concealment 音声符号化方法, 符号化器
speech signal, sound signal 音声信号, 音声情報
first non 拡散処理
pass filter, LP filter ローパス
first impulse 入力音
impulse responses 生成器, 音発生
XXXXXXXXXXXXXXXXXXXXXXX
54

US6266632B1

(Kiminori Kato, 2001)
(Original Assignee) Matsushita Graphic Communication Systems Inc     

(Current Assignee)
Panasonic System Solutions Japan Co Ltd
Speech decoding apparatus and speech decoding method using energy of excitation parameter last non predetermined number
average energy correction unit, first sum
encoding parameters coding method
LP filter excitation signal represents a, when i
XXXXXXXXXXX
55

US6240387B1

(Andrew P. DeJaco, 2001)
(Original Assignee) Qualcomm Inc     

(Current Assignee)
Qualcomm Inc
Method and apparatus for performing speech frame encoding mode selection in a variable rate encoding system recovery parameters linear predictive coding
current frame third threshold
determining concealment third encoding
decoder determines concealment speech coder
XXXX
56

JP2000214900A

(Ko Amada, 2000)
(Original Assignee) Toshiba Corp; 株式会社東芝     音声符号化/復号化方法 decoder determines concealment 音声符号化方法
sound signal, speech signal 音声信号, 音声復号
XXXXXXXXXXXXXXXXXXXXXXX
57

EP0932141A2

(Ralf Kirchherr, 1999)
(Original Assignee) Deutsche Telekom AG     

(Current Assignee)
Deutsche Telekom AG
Method for signal controlled switching between different audio coding schemes decoder concealment, frame erasure concealment error concealment, current frame
LP filter domain decoder
frame erasure frame erasure
XXXXXXXXXXXXXXXXXXXXXXXXX
58

US5987406A

(Tero Honkanen, 1999)
(Original Assignee) Universite de Sherbrooke     

(Current Assignee)
Universite de Sherbrooke
Instability eradication for analysis-by-synthesis speech codecs decoder determines concealment encoded speech signal
maximum amplitude signal component
comfort noise high value
XXXXXX
59

US6311154B1

(Allen Gersho, 2001)
(Original Assignee) Nokia Mobile Phones Ltd     

(Current Assignee)
Nokia Mobile Phones Ltd ; Microsoft Technology Licensing LLC
Adaptive windows for analysis-by-synthesis CELP-type speech coding decoder determines concealment, decoder concealment encoded speech signal, pitch period
comfort noise codebook excitation
sound signal, speech signal adaptive codebook
average energy fixed codebook
XXXXXXXXXXXXXXXXXXXXXXX
60

JPH11259098A

(Ko Amada, 1999)
(Original Assignee) Toshiba Corp; 株式会社東芝     音声符号化/復号化方法 speech signal 音声復号
energy information parameter 最適化
XXXXXXXXXXXXXXXXXXXXX
61

US6385576B2

(Tadashi Amada, 2002)
(Original Assignee) Toshiba Corp     

(Current Assignee)
Toshiba Corp
Speech encoding/decoding method using reduced subframe pulse positions having density related to pitch decoder recovery, decoder constructs decoding apparatus
sound signal, speech signal adaptive codebook
encoding parameters coding method
controlling energy large power
XXXXXXXXXXXXXXXXXXXXXXX
62

US6226606B1

(Alejandro Acero, 2001)
(Original Assignee) Microsoft Corp     

(Current Assignee)
Zhigu Holdings Ltd
Method and apparatus for pitch tracking decoder determines concealment speech coder
impulse responses time window
LP filter comprises i, steps c
XXXXXXXX
63

US6310915B1

(Aaron Wells, 2001)
(Original Assignee) Harmonic Inc     

(Current Assignee)
LSI Corp ; Harmonic Inc ; Divicom Inc
Video transcoder with bitstream look ahead for rate control and statistical multiplexing energy information parameter, phase information parameter coded representation
encoding parameters encoding parameters
decoder concealment, frame erasure concealment error concealment
pass filter pass filter
first impulse said model
XXXXXXXXXXXXXXXXXXXXX
64

US6373842B1

(Paul Coverdale, 2002)
(Original Assignee) Nortel Networks Ltd     

(Current Assignee)
Microsoft Technology Licensing LLC
Unidirectional streaming services in wireless systems communication link communication link
first non, last non said transmission, time t
maximum amplitude control signal
last frame, current frame data frames
LP filter comprises i
XXXXXXXXXXXXXXXXXXXXXXXXX
65

EP0911807A2

(Masayuki Nishiguchi, Sony Corporation, 1999)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Sound synthesizing method and apparatus, and sound band expanding method and apparatus impulse responses, impulse response impulse response
speech signal sound parameters
pitch period, signal classification parameter input sound
first non first one
pass filter band stop
XXXXXXXXXXXXXXXXXXXXXXXXX
66

JP2000132194A

(Kenichi Makino, 2000)
(Original Assignee) Sony Corp; ソニー株式会社     信号符号化装置及び方法、並びに信号復号装置及び方法 frame erasure, frame erasure concealment 線形予測符号化
maximum amplitude 有すること
XXXXXXXXXXXXXXXXXXXXXXXXX
67

US6289297B1

(Paramvir Bahl, 2001)
(Original Assignee) Microsoft Corp     

(Current Assignee)
Microsoft Technology Licensing LLC
Method for reconstructing a video frame received from a video source over a communication channel encoding parameters contains information
decoder concealment, decoder recovery transmission error
LP filter steps c
XXXXXXXXXXXXXXXXXX
68

US6202045B1

(Pasi Ojala, 2001)
(Original Assignee) Nokia Mobile Phones Ltd     

(Current Assignee)
Provenance Asset Group LLC ; Nokia USA Inc
Speech coding with variable model order linear prediction determining concealment, encoding parameters autocorrelation function, line spectral frequency
recovery parameters reflection coefficients
speech signal, decoder determines concealment speech signal
frame erasure PC mode
XXXXXXXXXXXXXXXXXXXXXXXXX
69

US6385573B1

(Yang Gao, 2002)
(Original Assignee) Lakestar Semi Inc     

(Current Assignee)
Samsung Electronics Co Ltd
Adaptive tilt compensation for synthesized speech residual speech signal, decoder determines concealment speech signal
LP filter comprises i
XXXXXXXXXXXXX
70

US20010023395A1

(Huan-Yu Su, 2001)
(Original Assignee) Lakestar Semi Inc     

(Current Assignee)
Samsung Electronics Co Ltd
Speech encoder adaptively applying pitch preprocessing with warping of target signal encoding parameters, determining concealment line spectral frequency
sound signal, speech signal adaptive codebook, speech signal
last frame, replacement frame speech encoder
E q pitch p
XXXXXXXXXXXXXXXXXXXXXXXXX
71

US6260010B1

(Yang Gao, 2001)
(Original Assignee) Lakestar Semi Inc     

(Current Assignee)
MACOM Technology Solutions Holdings Inc
Speech encoder using gain normalization that combines open and closed loop gains sound signal, speech signal adaptive codebook
LP filter background noise
average energy fixed codebook
average pitch value maximum limit
XXXXXXXXXXXXXXXXXXXXXXXXX
72

US6330533B2

(Huan-Yu Su, 2001)
(Original Assignee) Lakestar Semi Inc     

(Current Assignee)
Samsung Electronics Co Ltd
Speech encoder adaptively applying pitch preprocessing with warping of target signal encoding parameters, determining concealment line spectral frequency
sound signal, speech signal adaptive codebook
last frame, replacement frame speech encoder
E q pitch p
XXXXXXXXXXXXXXXXXXXXXXXXX
73

US6104992A

(Yang Gao, 2000)
(Original Assignee) Lakestar Semi Inc     

(Current Assignee)
Hanger Solutions LLC
Adaptive gain reduction to produce fixed codebook target signal decoder concealment, decoder recovery second residual signal, first residual signal
first impulse filtered signal
average energy fixed codebook
speech signal, decoder determines concealment speech signal
XXXXXXXXXXXXXXXXXXX
74

US6188980B1

(Jes Thyssen, 2001)
(Original Assignee) Lakestar Semi Inc     

(Current Assignee)
Samsung Electronics Co Ltd
Synchronized encoder-decoder frame concealment using speech coding parameters including line spectral frequencies and filter coefficients encoding parameters, determining concealment line spectral frequency
last frame, replacement frame speech encoder
speech signal, decoder determines concealment speech signal
frame erasure frame erasure
XXXXXXXXXXXXXXXXXXXXXXXXX
75

US6122608A

(Alan V. McCree, 2000)
(Original Assignee) Texas Instruments Inc     

(Current Assignee)
Texas Instruments Inc
Method for switched-predictive quantization determining concealment autocorrelation function
first impulse, first impulse response weighting function
decoder determines concealment Linear Prediction
impulse responses, impulse response impulse response
last subframe weighting value
controlling energy second target
XXXXXXXXX
76

JP2000059231A

(Yukio Fujii, 2000)
(Original Assignee) Hitachi Ltd; 株式会社日立製作所     圧縮音声エラー補償方法およびデータストリーム再生装置 frame erasure concealment エラーフラグ
last subframe の誤り検出
speech signal の音声信号
decoder determines concealment サンプル値
average pitch 上記逆
XXXXXXXXX
77

JPH1198090A

(Kiyoko Tanaka, 1999)
(Original Assignee) Nec Corp; 日本電気株式会社     音声符号化/復号化装置 first impulse 入力音
average pitch per
comfort noise の雑音
determining concealment ELP
XXXX
78

US6081776A

(Mark Lewis Grabb, 2000)
(Original Assignee) Lockheed Martin Corp     

(Current Assignee)
Lockheed Martin Corp
Speech coding system and method including adaptive finite impulse response filter encoding parameters, decoder determines concealment spectral frequency
impulse responses, impulse response impulse response
XXXXXXXX
79

CN1243615A

(M·A·琼斯, 2000)
(Original Assignee) 艾利森公司     用于控制信号的振幅电平的方法和装置 preceding impulse response 该输出信号, 输入信
recovery parameters 接收一
XX
80

WO9913570A1

(Mark A. Jones, 1999)
(Original Assignee) Ericsson Inc.     Method and apparatus for controlling signal amplitude level maximum amplitude control signal
LP filter excitation signal input terminal
pass filter pass filter
LP filter steps c
XXXXXXXXXX
81

US6115687A

(Naoya Tanaka, 2000)
(Original Assignee) Panasonic Corp     

(Current Assignee)
III Holdings 12 LLC
Sound reproducing speed converter impulse response inverse filter
decoder concealment, pitch period pitch period
XXXXXXXXXX
82

US6029126A

(Henrique S. Malvar, 2000)
(Original Assignee) Microsoft Corp     

(Current Assignee)
Microsoft Technology Licensing LLC
Scalable audio coder and decoder first impulse, first impulse response weighting function
decoder constructs entropy encoder
XX
83

US6141638A

(Weimin Peng, 2000)
(Original Assignee) Motorola Solutions Inc     

(Current Assignee)
Google Technology Holdings LLC
Method and apparatus for coding an information signal signal classification parameter, phase information parameter predetermined parameters
speech signal, decoder determines concealment speech signal
XXXXXXXXXXXXXXXXXXXXX
84

US6108626A

(Luca Cellario, 2000)
(Original Assignee) Robert Bosch GmbH; Centro Studi e Laboratori Telecomunicazioni SpA (CSELT)     

(Current Assignee)
CSELT- CENTRO STUDI E LABORATORI TELECOMUNICAZIONI SpA ; Robert Bosch GmbH ; Centro Studi e Laboratori Telecomunicazioni SpA (CSELT) ; Nuance Communications Inc
Object oriented audio coding encoding parameters speech signal processing
pitch period predetermined bandwidth
last non said processing unit, predetermined number
decoder determines concealment second decoding unit, first decoding unit
signal energy, LP filter lowest frequency, coding devices
controlling energy desired bit rate
current frame given frequency
comfort noise other bands
XXXXXXXXXXXXXXXXXX
85

WO9953479A1

(Mohammed Javed Absar, 1999)
(Original Assignee) Sgs-Thomson Microelectronics Asia Pacific (Pte) Ltd.     Fast frame optimisation in an audio encoder encoding parameters encoding parameters
signal classification parameter different types
first non, last non output bits
last frame data blocks
XXXXXXXXXXXXXXXXXXXXXXX
86

US6208962B1

(Kazunori Ozawa, 2001)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Signal coding system last non predetermined number
frame concealment, decoder determines concealment residual error
XXXXXX
87

US6236961B1

(Kazunori Ozawa, 2001)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Speech signal coder LP filter, LP filter excitation signal inverse filtering
impulse responses, impulse response impulse responses
E q pitch p
XXXXXXXX
88

GB2324689A

(John Clark Hardwick, 1998)
(Original Assignee) Digital Voice Systems Inc     

(Current Assignee)
Digital Voice Systems Inc
Dual subframe quantisation of spectral magnitudes decoder determines concealment, speech signal Discrete Cosine, speech signal
last subframe last subframe
decoder recovery decoded bits
XXXXXXXXXXXXXXXXXXX
89

US6167375A

(Kimio Miseki, 2000)
(Original Assignee) Toshiba Corp     

(Current Assignee)
Toshiba Corp
Method for encoding and decoding a speech signal including background noise decoder recovery, decoder constructs decoding apparatus
E first decode
XXXXXXXXXXXXXXXXXX
90

US6064954A

(Gilad Cohen, 2000)
(Original Assignee) International Business Machines Corp     

(Current Assignee)
Cisco Technology Inc
Digital audio signal coding energy information parameter, phase information parameter coded representation
impulse responses signal samples
LP filter excitation signal represents a
XXXXXXXXXXXXXXXXXXXXX
91

US6134518A

(Gilad Cohen, 2000)
(Original Assignee) International Business Machines Corp     

(Current Assignee)
Cisco Technology Inc
Digital audio signal coding using a CELP coder and a transform coder energy information parameter, phase information parameter coded representation
impulse responses signal samples
E first decode
average energy audio data
LP filter steps c
XXXXXXXXXXXXXXXXXXXXX
92

US6263312B1

(Victor D. Kolesnik, 2001)
(Original Assignee) Alaris Inc; G T Tech Inc     

(Current Assignee)
XVD TECHNOLOGY HOLDINGS Ltd (IRELAND)
Audio compression and decompression employing subband decomposition of residual signal and distortion reduction decoder concealment, decoder recovery second residual signal, first residual signal
frame concealment second frames
E first decode
average energy audio data
XXXXXXXXXXXXXXXXXXX
93

US5963897A

(Manel Guberna Alpuente, 1999)
(Original Assignee) Lernout and Hauspie Speech Products NV     

(Current Assignee)
Nuance Communications Inc
Apparatus and method for hybrid excited linear prediction speech encoding frame erasure concealment, conducting frame erasure concealment redundancy information
recovery parameters repeating step
speech signal, decoder determines concealment speech signal
last subframe fixed number
XXXXXXXXXXXXXXXXXXX
94

JPH11184498A

(Kimio Miseki, 1999)
(Original Assignee) Toshiba Corp; 株式会社東芝     音声符号化/復号化方法 decoder determines concealment 音声符号化方法
maximum amplitude 有すること
sound signal, speech signal 音声信号, 音声復号
first impulse 入力音
XXXXXXXXXXXXXXXXXXXXXXX
95

US6009388A

(Kazunori Ozawa, 1999)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
High quality speech code and coding method comfort noise linear prediction coefficient
last non predetermined number, second line
LP filter, LP filter excitation signal inverse filtering, impulse response
sound signal, speech signal adaptive codebook
encoding parameters coding method
average pitch value judging unit
pitch period time length
E q pitch p
XXXXXXXXXXXXXXXXXXXXXXXXX
96

US5870412A

(Guido M. Schuster, 1999)
(Original Assignee) 3Com Corp     

(Current Assignee)
HP Inc ; Hewlett Packard Enterprise Development LP
Forward error correction system for packet based real time media concealing frame erasure forward error correction
last non, first non predetermined number, said transmission
recovery parameters repeating step
LP filter excitation signal represents a
decoder determines concealment lost packets
XXXXXXXXXXXXXXXXXX
97

WO9827543A2

(Eric D. Scheirer, 1998)
(Original Assignee) Interval Research Corporation     Multi-feature speech/music discrimination system average energy dimensional feature space, feature values
average pitch, average pitch value modulation frequencies
onset frame successive frames
impulse responses signal samples
pass filter pass filter
XXXXX
98

US6199037B1

(John C. Hardwick, 2001)
(Original Assignee) Digital Voice Systems Inc     

(Current Assignee)
Digital Voice Systems Inc
Joint quantization of speech subframe voicing metrics and fundamental frequencies last frame, replacement frame speech encoder
speech signal, decoder determines concealment speech signal
XXXXXXXXXXXXX
99

JPH10190498A

(Kari Jarvinen, 1998)
(Original Assignee) Nokia Mobile Phones Ltd; ノキア モービル フォーンズ リミテッド     不連続伝送中に快適雑音を発生させる改善された方法 first impulse response, impulse responses 有するベクトル, 周波数応答
other frames 有するフレーム
average pitch value それぞれ独立
onset frame フレーム間
first impulse の平均
pitch period 短期間
energy information parameter 最適化
XXXXXXXXXXXXXXXXXXX
100

US5960389A

(Kari Jarvinen, 1999)
(Original Assignee) Nokia Mobile Phones Ltd     

(Current Assignee)
Nokia Technologies Oy
Methods for generating comfort noise during discontinuous transmission LP filter continuous transmission, background noise
encoding parameters respective parameter
first impulse, impulse responses frequency response
comfort noise, decoder determines concealment CN parameters, comfort noise
pass filter odd number
frame erasure PC mode
XXXXXXXXXXXXXXXXXXXXXXXXX
101

US5884253A

(Willem Bastiaan Kleijn, 1999)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Prototype waveform speech coding with interpolation of pitch, pitch-period waveforms, and synthesis filter decoder concealment, decoder recovery second residual signal, first residual signal
conducting concealment communications channel
determining concealment second speech
frame erasure, concealing frame erasure first speech
XXXXXXXXXXXXXXXXXXXXXXXXX
102

WO9912155A1

(Anthony P. Mauro, 1999)
(Original Assignee) Qualcomm Incorporated     Channel gain modification system and method for noise reduction in voice communication determining concealment autocorrelation function
LP filter background noise
XXXXXX
103

US5909663A

(Kazuyuki Iijima, 1999)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Speech decoding method and apparatus for selecting random noise codevectors as excitation signals for an unvoiced speech frame decoder determines concealment encoded speech signal
decoder recovery, decoder constructs decoding apparatus
sound signal noise component
average energy detecting step
encoding parameters coding method
XXXXXXXXXXXXXXXXXXXXXXX
104

EP0834863A2

(Ozawa Kazunori, 1998)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Speech coder at low bit rates last non predetermined number
sound signal, speech signal adaptive codebook
impulse responses, impulse response impulse responses, second pulses
current frame zero amplitude
average pitch average pitch
LP filter excitation signal represents a
XXXXXXXXXXXXXXXXXXXXXXXXX
105

US5963896A

(Kazunori Ozawa, 1999)
(Original Assignee) NEC Corp     

(Current Assignee)
Rakuten Inc
Speech coder including an excitation quantizer for retrieving positions of amplitude pulses using spectral parameters and different gains for groups of the pulses last non predetermined number
sound signal, speech signal adaptive codebook
impulse responses, impulse response impulse responses, second pulses
current frame zero amplitude
average pitch average pitch
LP filter excitation signal represents a
XXXXXXXXXXXXXXXXXXXXXXXXX
106

US5956672A

(Masahiro Serizawa, 1999)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Wide-band speech spectral quantizer placing remaining impulse responses predetermined frequency
determining concealment prediction error
speech signal, decoder determines concealment speech signal
XXXXXXXXX
107

JPH1130997A

(Toshiyuki Nomura, 1999)
(Original Assignee) Nec Corp; 日本電気株式会社     音声符号化復号装置 decoder determines concealment ダウンサンプリング
first impulse 入力音
XX
108

US5924062A

(Tin Maung, 1999)
(Original Assignee) Nokia Mobile Phones Ltd     

(Current Assignee)
Qualcomm Inc
ACLEP codec with modified autocorrelation matrix storage and search first impulse, impulse responses impulse response vector, response signal
first non first mapping
signal classification parameter, signal energy upper portion
speech signal, decoder determines concealment speech signal
frame concealment CELP codec
E, E q T rows
XXXXXXXXXXXXXXXXXXXXXXXXX
109

US6073092A

(Soon Y. Kwon, 2000)
(Original Assignee) Telogy Networks Inc     

(Current Assignee)
Google Technology Holdings LLC
Method for speech coding based on a code excited linear prediction (CELP) model recovery parameters linear predictive coding
decoder determines concealment, LP filter encoded speech signal, inverse filtering
sound signal, speech signal adaptive codebook
average energy fixed codebook
comfort noise filter output
average pitch value minimum mean
pass filter pass filter
signal energy random code
average pitch, E q high p, pitch p
XXXXXXXXXXXXXXXXXXXXXXXXX
110

US5966689A

(Alan V. McCree, 1999)
(Original Assignee) Texas Instruments Inc     

(Current Assignee)
Texas Instruments Inc
Adaptive filter and filtering method for low bit rate coding current frame, decoder determines concealment successive samples, current frame
first impulse, impulse response unit delay, second filtering
sound signal, signal energy digital signals
concealing frame erasure predicted value
onset frame said signals
maximum amplitude said system
XXXXXXXXXXXXXXXXXXXXXXXXX
111

US5878388A

(Masayuki Nishiguchi, 1999)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Voice analysis-synthesis method using noise having diffusion which varies with frequency band to modify predicted phases of transmitted pitch data blocks current frame, decoder determines concealment current frame
speech signal initial phase
XXXXXXXXXXX
112

US5873060A

(Kazunori Ozawa, 1999)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Signal coder for wide-band signals current frame zero amplitude
average pitch value judging unit
XXXXXX
113

US6009122A

(Jacky S. Chow, 1999)
(Original Assignee) Amati Communications Corp     

(Current Assignee)
Texas Instruments Inc
Method and apparatus for superframe bit allocation LP filter, LP filter excitation signal digital frequency
sound signal, signal energy digital signals
current frame analog signal
XXXXXXXXXXXXXXXXXXXXXXXXX
114

US5953697A

(Chin-Teng Lin, 1999)
(Original Assignee) Holtek Semiconductor Inc     

(Current Assignee)
Holtek Semiconductor Inc
Gain estimation scheme for LPC vocoders with a shape index based on signal envelopes recovery parameters linear predictive coding, repeating step
decoder constructs encoded parameter
placing remaining impulse responses codebook approach
impulse responses, impulse response impulse response
frame concealment previous frames
XXXXXXXX
115

US6023672A

(Kazunori Ozawa, 2000)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Speech coder current frame zero amplitude
speech signal, decoder determines concealment speech signal, speech coder
encoding parameters coding method
average pitch value judging unit
XXXXXXXXXXXXX
116

JPH10282997A

(Toshiyuki Nomura, 1998)
(Original Assignee) Nec Corp; 日本電気株式会社     音声符号化装置及び復号装置 maximum amplitude 有すること
sound signal, speech signal 音声信号, 音声復号
XXXXXXXXXXXXXXXXXXXXXXX
117

US5907822A

(Jaime L. Prieto, 1999)
(Original Assignee) Lincom Corp     

(Current Assignee)
Engility LLC
Loss tolerant speech decoder for telecommunications first impulse energy characteristics
impulse responses, impulse response impulse response
speech signal, decoder determines concealment speech signal
XXXXXXXXXXXXXXX
118

US6122607A

(Erik Ekudden, 2000)
(Original Assignee) Telefonaktiebolaget LM Ericsson AB     

(Current Assignee)
Telefonaktiebolaget LM Ericsson AB
Method and arrangement for reconstruction of a received speech signal sound signal, signal classification parameter predetermined quality, speech signal
last non predetermined number
encoding parameters contains information
maximum amplitude control signal
impulse response inverse filter
LP filter excitation signal represents a
pass filter pass filter
frame concealment error rate
average energy first sum
XXXXXXXXXXXXXXXXXXXXXXXXX
119

JPH10260698A

(Kazunori Ozawa, 1998)
(Original Assignee) Nec Corp; 日本電気株式会社     信号符号化装置 sound signal, speech signal 音声信号
energy information parameter 最適化
XXXXXXXXXXXXXXXXXXXXXXX
120

US6170073B1

(Kari Jarvinen, 2001)
(Original Assignee) Nokia Mobile Phones UK Ltd     

(Current Assignee)
Nokia Oyj ; Intellectual Ventures I LLC
Method and apparatus for error detection in digital communications last frame, replacement frame speech encoder
decoder concealment, determining concealment bad frame
XXXXXX
121

US6292834B1

(Hemanth Srinivas Ravi, 2001)
(Original Assignee) Microsoft Corp     

(Current Assignee)
Microsoft Technology Licensing LLC
Dynamic bandwidth selection for efficient transmission of multimedia streams in a computer network encoding parameters data transmission rate
decoder recovery, decoder determines concealment incoming data
onset frame first time
XXXXXXXXXXXXXXXX
122

JPH10232699A

(Akihiro Nakahara, 1998)
(Original Assignee) Japan Radio Co Ltd; 日本無線株式会社     Lpcボコーダ recovery parameters, phase information parameter 特徴パラメータ
onset frame フレーム間
sound signal, speech signal 音声信号
XXXXXXXXXXXXXXXXXXXXXXX
123

US6317714B1

(Leonardo Del Castillo, 2001)
(Original Assignee) Microsoft Corp     

(Current Assignee)
Chartoleaux KG LLC
Controller and associated mechanical characters operable for continuously performing received control data while engaging in bidirectional communications over a single communications channel recovery parameters linear predictive coding, repeating step
onset frame first time
E q period T
XXXXXX
124

US5819213A

(Masahiro Oshikiri, 1998)
(Original Assignee) Toshiba Corp     

(Current Assignee)
Toshiba Corp
Speech encoding and decoding with pitch filter range unrestricted by codebook range and preselecting, then increasing, search candidates from linear overlap codebooks last non predetermined number
decoder recovery, decoder constructs decoding apparatus
sound signal, speech signal adaptive codebook
frame concealment, decoder determines concealment residual error
encoding parameters coding method
XXXXXXXXXXXXXXXXXXXXXXXXX
125

US6085158A

(Nobuhiko Naka, 2000)
(Original Assignee) NTT Mobile Communications Networks Inc     

(Current Assignee)
NTT Docomo Inc
Updating internal states of a speech decoder after errors have occurred decoder concealment, frame erasure concealment error concealment
last frame, replacement frame speech encoder
XXXXXXXXXXXXXXXXXXX
126

JPH10233692A

(Masaaki Isozaki, 1998)
(Original Assignee) Sony Corp; ソニー株式会社     オーディオ信号符号化装置および符号化方法並びにオーディオ信号復号装置および復号方法 maximum amplitude 有すること
preceding impulse response 選択指示
decoder concealment, frame erasure concealment エラー
current frame, replacement frame ワーク
XXXXXXXXXXXXXXXXXXXXXX
127

US5806024A

(Kazunori Ozawa, 1998)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Coding of a speech or music signal with quantization of harmonics components specifically and then residue components comfort noise linear prediction coefficient
impulse responses, impulse response impulse responses, response signal
onset frame successive frames
first impulse filtered signal
encoding parameters coding method
XXXXXXXX
128

US6173265B1

(Hidetaka Takahashi, 2001)
(Original Assignee) Olympus Corp     

(Current Assignee)
Olympus Corp
Voice recording and/or reproducing method and apparatus for reducing a deterioration of a voice signal due to a change over from one coding device to another coding device recovery parameters linear predictive coding
sound signal, speech signal adaptive codebook
pitch period coding device
XXXXXXXXXXXXXXXXXXXXXXX
129

JPH09321783A

(Noriaki Kono, 1997)
(Original Assignee) Mitsubishi Electric Corp; 三菱電機株式会社     音声符号化伝送システム speech signal の音声信号
determining concealment 決定手段
decoder recovery 検知器
signal classification parameter 値決定
concealing frame erasure ない時
XXXXXXXXXXXXXXXXXXXXXXX
130

US5819212A

(Jun Matsumoto, 1998)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Voice encoding method and apparatus using modified discrete cosine transform comfort noise linear prediction coefficient
first non, last non said transmission
communication link, decoder determines concealment discrete cosine, speech signal
encoding parameters coding method
XXXXXXXXXXXXXXXXXXXX
131

US5890108A

(Suat Yeldener, 1999)
(Original Assignee) Voxware Inc     

(Current Assignee)
Voxware Inc
Low bit-rate speech coding system and method using voicing probability determination recovery parameters linear predictive coding
encoding parameters encoding parameters
first impulse, first impulse response weighting function, frequency response
speech signal, decoder determines concealment speech signal
frame erasure PC mode
energy information parameter low end
XXXXXXXXXXXXXXXXXXXXXXXXX
132

US6064962A

(Masahiro Oshikiri, 2000)
(Original Assignee) Toshiba Corp     

(Current Assignee)
Toshiba Corp
Formant emphasis method and formant emphasis filter device average pitch value control device
current frame, decoder determines concealment current frame, pitch period
XXXXXXXXXX
133

JPH1069297A

(Kazunori Ozawa, 1998)
(Original Assignee) Nec Corp; 日本電気株式会社     音声符号化装置 comfort noise 切替えること
maximum amplitude 有すること
sound signal, speech signal 音声信号
XXXXXXXXXXXXXXXXXXXXXXX
134

US6092041A

(Davis Pan, 2000)
(Original Assignee) Motorola Solutions Inc     

(Current Assignee)
Google Technology Holdings LLC
System and method of encoding and decoding a layered bitstream by re-applying psychoacoustic analysis in the decoder comfort noise programmable gate array
decoder recovery decoded bits
XX
135

JPH1039898A

(Toshihiro Hayata, 1998)
(Original Assignee) Nec Corp; 日本電気株式会社     音声信号伝送方法及び音声符号復号化システム speech signal の音声信号, 音声復号
first impulse 入力音
pitch period 期間中
XXXXXXXXXXXXX
136

US5809456A

(Silvio Cucchi, 1998)
(Original Assignee) Alcatel Lucent Italia SpA     

(Current Assignee)
Alcatel Lucent Italia SpA
Voiced speech coding and decoding using phase-adapted single excitation LP filter, LP filter excitation signal autocorrelation coefficients
recovery parameters linear predictive coding
comfort noise filter output
encoding parameters coding method
first non time t
E q when p
XXXXXXXXXXXX
137

US5819298A

(Thomas K. Wong, 1998)
(Original Assignee) Sun Microsystems Inc     

(Current Assignee)
Oracle America Inc
File allocation tables with holes LP filter excitation signal represents a, when i
first non first one
XXXXXXXXXXXX
138

US6029128A

(Kari Jarvinen, 2000)
(Original Assignee) Nokia Mobile Phones Ltd     

(Current Assignee)
Nokia Technologies Oy
Speech synthesizer ⁢ E following relationships
sound signal, speech signal adaptive codebook, speech signal
average energy fixed codebook, linear scale
XXXXXXXXXXXXXXXXXXXXXXXXX
139

JPH09120298A

(Peter Kroon, 1997)
(Original Assignee) At & T Ipm Corp; エイ・ティ・アンド・ティ・アイピーエム・コーポレーション     フレーム消失の間の音声復号に使用する音声の有声/無声分類 speech signal 音声復号方法, の音声信号
maximum amplitude 有すること
XXXXXXXXXXX
140

US5884252A

(Kazunori Ozawa, 1999)
(Original Assignee) NEC Corp     

(Current Assignee)
Rakuten Inc
Method of and apparatus for coding speech signal sound signal, speech signal adaptive codebook
impulse responses, impulse response impulse response, response signal
LP filter excitation signal input terminal, represents a
XXXXXXXXXXXXXXXXXXXXXXXXX
141

EP0747882A2

(Yair Shoham, 1996)
(Original Assignee) AT&T Corp; AT&T IPM Corp     

(Current Assignee)
AT&T Corp
Pitch delay modification during frame erasures sound signal, speech signal adaptive codebook, speech signal
LP filter comprises i
XXXXXXXXXXXXXXXXXXXXXXXXX
142

US5845244A

(Stephane Proust, 1998)
(Original Assignee) France Telecom SA     

(Current Assignee)
Orange SA
Adapting noise masking level in analysis-by-synthesis employing perceptual weighting comfort noise linear prediction coefficient
recovery parameters reflection coefficients
onset frame successive frames
speech signal, decoder determines concealment speech signal
encoding parameters coding method
XXXXXXXXX
143

JPH09281996A

(Kazuyuki Iijima, 1997)
(Original Assignee) Sony Corp; ソニー株式会社     有声音/無声音判定方法及び装置、並びに音声符号化方法 conducting frame erasure concealment の少なくとも1
decoder determines concealment 音声符号化方法
maximum amplitude 有すること
sound signal, speech signal 音声信号
first impulse 入力音
XXXXXXXXXXXXXXXXXXXXXXX
144

US5778335A

(Anil Wamanrao Ubale, 1998)
(Original Assignee) University of California     

(Current Assignee)
University of California
Method and apparatus for efficient multiband celp wideband speech and music coding and decoding comfort noise linear prediction coefficient
sound signal, speech signal adaptive codebook
XXXXXXXXXXXXXXXXXXXXXXX
145

US5717822A

(Juin-Hwey Chen, 1998)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Computational complexity reduction during frame erasure of packet loss comfort noise linear prediction coefficient
LP filter, LP filter excitation signal autocorrelation coefficients
impulse responses signal samples
XXXXXXXXXX
146

US6006175A

(John F. Holzrichter, 1999)
(Original Assignee) University of California     

(Current Assignee)
Lawrence Livermore National Security LLC
Methods and apparatus for non-acoustic speech characterization and recognition phase information parameter boundary condition
first impulse response, impulse response sampling time
signal classification parameter normal sound
last subframe end time
XXXXXXXXXXXXXXXXXXXXX
147

JPH09185397A

(Hidetaka Takahashi, 1997)
(Original Assignee) Olympus Optical Co Ltd; オリンパス光学工業株式会社     音声情報記録装置 frame erasure, frame erasure concealment 線形予測符号化
speech signal 音声情報, する記録
XXXXXXXXXXXXXXXXXXXXXXXXX
148

US5819217A

(Vijay Rangan Raman, 1998)
(Original Assignee) Bell Atlantic Science and Technology Inc     

(Current Assignee)
Verizon Patent and Licensing Inc
Method and system for differentiating between speech and noise average energy average energy
last frame last frame
XXXXXXXXX
149

US5673363A

(Byeungwoo Jeon, 1997)
(Original Assignee) Samsung Electronics Co Ltd     

(Current Assignee)
Samsung Electronics Co Ltd
Error concealment method and apparatus of audio signals last non predetermined number
signal classification parameter preceding frame
XXXXXXXXXXXXXXXXXXXXX
150

US5745871A

(Juin-Hwey Chen, 1998)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Pitch period estimation for use with audio coders signal classification parameter preceding frame
speech signal, decoder determines concealment speech signal
XXXXXXXXXXXXXXXXXXXXX
151

US5799276A

(Edward Komissarchik, 1998)
(Original Assignee) Accent Inc     

(Current Assignee)
Rosetta Stone Ltd
Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals speech signal acoustic characteristics
decoder concealment, pitch period pitch period
XXXXXXXXXXXXX
152

US5596676A

(Kumar Swaminathan, 1997)
(Original Assignee) Hughes Electronics Corp     

(Current Assignee)
JPMorgan Chase Bank NA ; Hughes Network Systems LLC
Mode-specific method and apparatus for encoding signals containing speech encoding parameters coding method
last non second line
XXXXXX
153

US5835495A

(Philippe Ferriere, 1998)
(Original Assignee) Microsoft Corp     

(Current Assignee)
Microsoft Technology Licensing LLC
System and method for scaleable streamed audio transmission over a network controlling energy digital audio samples
signal classification parameter selecting step
onset frame audio frames
LP filter excitation signal represents a
XXXXXXXXXXXXXXXXXXXXX
154

US5774839A

(Eyal Shlomot, 1998)
(Original Assignee) Rockwell International Corp     

(Current Assignee)
Nytell Software LLC
Delayed decision switched prediction multi-stage LSF vector quantization recovery parameters predetermined set
impulse response distance measure
encoding parameters new set
XXXXXX
155

US5704003A

(Willem Bastiaan Kleijn, 1997)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
RCELP coder signal classification parameter preceding frame
average energy average energy
current frame, decoder determines concealment current frame
encoding parameters coding method
XXXXXXXXXXXXXXXXXXX
156

US5774837A

(Suat Yeldener, 1998)
(Original Assignee) Voxware Inc     

(Current Assignee)
Voxware Inc
Speech coding system and method using voicing probability determination recovery parameters linear predictive coding
phase information parameter boundary condition
LP filter excitation signal synthesized signal
first impulse, first impulse response generating filter
decoder constructs encoded parameter
determining concealment prediction error
speech signal, decoder determines concealment speech signal, initial phase
energy information parameter low end
XXXXXXXXXXXXXXXXXXXXXXXXX
157

US5749065A

(Masayuki Nishiguchi, 1998)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Speech encoding method, speech decoding method and speech encoding/decoding method recovery parameters linear predictive coding
encoding parameters respective parameter
decoder recovery, decoder constructs decoding apparatus
LP filter signal parameters
last subframe fixed number
XXXXXXXXXXXXXXXXXX
158

US5724433A

(A. Maynard Engebretson, 1998)
(Original Assignee) K/S Himpp     

(Current Assignee)
HIMPP K/S ; K/S Himpp
Adaptive gain and filtering circuit for a sound reproduction system first impulse filtered signal
maximum amplitude control signal
XXXXXX
159

US5668925A

(Joseph Harvey Rothweiler, 1997)
(Original Assignee) Martin Marietta Corp     

(Current Assignee)
RETRO REFLECTIVE OPTICS
Low data rate speech encoder with mixed excitation first impulse generating codewords
average pitch value average pitch value
speech signal, decoder determines concealment speech signal
onset frame said signals
maximum amplitude said system
recovery parameters said path
E q pitch p
XXXXXXXXX
160

US5781880A

(Huan-Yu Su, 1998)
(Original Assignee) Rockwell International Corp     

(Current Assignee)
Rockwell Science Center Inc ; WIAV Solutions LLC
Pitch lag estimation using frequency-domain lowpass filtering of the linear predictive coding (LPC) residual recovery parameters linear predictive coding
E q weighted average
determining concealment prediction error
encoding parameters coding method
last subframe, last frame output means
pass filter pass filter, odd number
average energy point D
XXXXXXXXXXXXX
161

JPH08263098A

(Kazunaga Ikeda, 1996)
(Original Assignee) Nippon Telegr & Teleph Corp <Ntt>; 日本電信電話株式会社     Acoustic signal encoding method and acoustic signal decoding method (音響信号符号化方法、音響信号復号化方法) signal energy for each frame (各フレームごと)
maximum amplitude having (有すること)
decoder concealment transform decoding (変換復号化)
XXXXXXX
162

US5699478A

(Dror Nahumi, 1997)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Frame erasure compensation technique energy information parameter, phase information parameter coded representation
last non predetermined number
frame erasure frame erasure
encoding parameters coding method
XXXXXXXXXXXXXXXXXXXXXXXXX
163

US5884010A

(Juin-Hwey Chen, 1999)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Evonik Goldschmidt GmbH ; Nokia of America Corp
Linear prediction coefficient generation during frame erasure or packet loss impulse responses signal samples
LP filter comprises i
replacement frame time lag
E q pitch p
XXXXXXXX
164

EP0691751A1

(Makoto Mitsuno, 1996)
(Original Assignee) Sony Corp     

(Current Assignee)
Sony Corp
Method and device for compressing information, method and device for expanding compressed information, device for recording/transmitting compressed information, device for receiving compressed information, and recording medium recovery parameters receiving apparatus
maximum amplitude signal component, based signal
pitch period time length
controlling energy rising time
XXXXXXX
165

US5699477A

(Alan V. McCree, 1997)
(Original Assignee) Texas Instruments Inc     

(Current Assignee)
Texas Instruments Inc
Mixed excitation linear prediction with fractional pitch comfort noise linear prediction coefficient
determining concealment predictive filter
pitch period, signal classification parameter input sound
XXXXXXXXXXXXXXXXXXXXX
166

US5717818A

(Yoshito Nejime, 1998)
(Original Assignee) Hitachi Ltd     

(Current Assignee)
Hitachi Ltd
Audio signal storing apparatus having a function for converting speech speed LP filter excitation signal input terminal
speech signal, decoder determines concealment speech signal
current frame analog signal
last subframe, last frame output means
pitch period time length
pass filter pass filter
determining concealment time span
replacement frame time lag
first non time t
XXXXXXXXXXXXXXXXXXXXX
167

US5787387A

(Joseph Gerard Aguilar, 1998)
(Original Assignee) Voxware Inc     

(Current Assignee)
Google LLC
Harmonic adaptive speech coding method and system recovery parameters linear predictive coding
phase information parameter boundary condition
LP filter excitation signal synthesized signal, represents a
determining concealment prediction error
speech signal, decoder determines concealment speech signal, initial phase
XXXXXXXXXXXXXXXXXXXXXXX
168

US5664051A

(John C. Hardwick, 1997)
(Original Assignee) Digital Voice Systems Inc     

(Current Assignee)
Digital Voice Systems Inc
Method and apparatus for phase synthesis for speech processing last frame, replacement frame speech encoder
speech signal, decoder determines concealment speech signal
XXXXXXXXXXXXX
169

US5598506A

(Karl T. Wigren, 1997)
(Original Assignee) Telefonaktiebolaget LM Ericsson AB     

(Current Assignee)
Telefonaktiebolaget LM Ericsson AB
Apparatus and a method for concealing transmission errors in a speech decoder decoder concealment, decoder recovery transmission error
pass filter pass filter
LP filter comprises i
XXXXXXXXXXXXXXXXXX
170

US5734789A

(Kumar Swaminathan, 1998)
(Original Assignee) Hughes Electronics Corp     

(Current Assignee)
JPMorgan Chase Bank NA ; Hughes Network Systems LLC
Voiced, unvoiced or noise modes in a CELP vocoder current frame characteristic value
average energy second intermediate
encoding parameters coding method
determining concealment second speech
frame erasure, concealing frame erasure first speech
comfort noise high value
XXXXXXXXXXXXXXXXXXXXXXXXX
171

US5717823A

(Willem Bastiaan Kleijn, 1998)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Speech-rate modification for linear-prediction based analysis-by-synthesis speech coders sound signal, speech signal adaptive codebook
maximum amplitude control signal
average energy fixed codebook
decoder determines concealment speech coder
XXXXXXXXXXXXXXXXXXXXXXX
172

US5450449A

(Peter Kroon, 1995)
(Original Assignee) AT&T IPM Corp     

(Current Assignee)
AT&T Corp ; Nokia of America Corp
Linear prediction coefficient generation during frame erasure or packet loss comfort noise linear prediction coefficient
signal classification parameter linear prediction filter
first impulse, impulse responses frequency response
speech signal, decoder determines concealment speech signal
frame erasure frame erasure
XXXXXXXXXXXXXXXXXXXXXXXXX
173

US5574825A

(Juin-Hwey Chen, 1996)
(Original Assignee) Nokia of America Corp     

(Current Assignee)
Nokia of America Corp
Linear prediction coefficient generation during frame erasure or packet loss comfort noise linear prediction coefficient
signal classification parameter linear prediction filter
speech signal, decoder determines concealment speech signal
frame erasure frame erasure
XXXXXXXXXXXXXXXXXXXXXXXXX
174

US5517595A

(Willem B. Kleijn, 1996)
(Original Assignee) AT&T Corp     

(Current Assignee)
AT&T Corp
Decomposition in noise and periodic signal waveforms in waveform interpolation phase information parameter determining parameters
E q weighted average
LP filter, LP filter excitation signal domain samples, represents a
average energy fixed codebook
impulse responses signal samples
onset frame said signals
XXXXXXXXXXXXXXXXXXXXX
175

US5862518A

(Toshiyuki Nomura, 1999)
(Original Assignee) NEC Corp     

(Current Assignee)
NEC Corp
Speech decoder for decoding a speech signal using a bad frame masking unit for voiced frame and a bad frame masking unit for unvoiced frame sound signal, speech signal adaptive codebook
last frame, current frame current frames
decoder concealment, pitch period pitch period, error frame
average pitch value judging unit
XXXXXXXXXXXXXXXXXXXXXXXXX
176

US5717824A

(Harprit S. Chhatwal, 1998)
(Original Assignee) Pacific Communication Sciences Inc     

(Current Assignee)
Cirrus Logic Inc ; Mindspeed Technologies LLC ; AudioCodes Inc
Adaptive speech coder having code excited linear predictor with multiple codebook searches sound signal, speech signal adaptive codebook
determining concealment second speech
frame erasure, concealing frame erasure first speech
pass filter pass filter
frame concealment error value
average pitch, E q high p
XXXXXXXXXXXXXXXXXXXXXXXXX




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V. : 1591-1594 1997

Publication Year: 1997

MELP: The New Federal Standard At 2400 Bps

United States Department of Defense, Meade, MD, USA

Supplee, Cohn, Collura, McCree, IEEE Comp Soc
US7693710B2
CLAIM 1
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
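
For orientation on the excitation-construction limitation recited in claim 1 (and mirrored in device claim 13), the following Python sketch shows one plausible way a low-pass filtered periodic train of pulses could be realized : a first low-pass impulse response is centered on the decoded glottal-pulse position, and the remaining impulse responses are placed one average pitch apart up to the end of the affected region. The function name, the toy filter and all numeric values are illustrative assumptions; they are not taken from the patent, from MELP, or from any charted reference.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_filter):
    """Sketch: periodic excitation as a low-pass filtered train of pulses.

    frame_len       -- number of samples affected by the artificial construction
    first_pulse_pos -- quantized position of the first glottal pulse, in samples
                       from the beginning of the lost onset frame
    avg_pitch       -- average pitch value in samples
    lp_filter       -- impulse response of a low-pass filter (1-D array)
    """
    excitation = np.zeros(frame_len)
    half = len(lp_filter) // 2
    pos = first_pulse_pos
    # One impulse response per pitch cycle, centered on each pulse position.
    while pos < frame_len:
        start = pos - half
        for k, h in enumerate(lp_filter):
            idx = start + k
            if 0 <= idx < frame_len:
                excitation[idx] += h
        pos += avg_pitch          # next pulse one average pitch value later
    return excitation

# Illustrative use with assumed values (not from the patent):
lowpass = np.array([0.25, 0.5, 1.0, 0.5, 0.25])   # toy low-pass impulse response
exc = build_periodic_excitation(frame_len=256, first_pulse_pos=17,
                                avg_pitch=60, lp_filter=lowpass)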

US7693710B2
CLAIM 2
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
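
Claim 2 (and claims 10, 14 and 22) further require encoding a shape, a sign and an amplitude for the first glottal pulse as part of the phase information parameter. The minimal Python sketch below shows one possible parameterization, assuming a small pulse-shape codebook and a uniform amplitude quantizer; these choices are assumptions for illustration and are not asserted to be the patent's or MELP's actual scheme.

import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_codebook, amp_step=0.1):
    """Toy encoding of the sign, amplitude and shape of the pulse at pulse_pos."""
    amplitude = residual[pulse_pos]
    sign_bit = 0 if amplitude >= 0 else 1
    # Scalar-quantize the magnitude with an assumed uniform step.
    amp_index = int(round(abs(amplitude) / amp_step))
    # Pick the codebook shape most correlated with a window around the pulse.
    w = 2                                     # assumed half-width of the pulse segment
    segment = residual[max(0, pulse_pos - w):pulse_pos + w + 1]
    segment = segment / (np.linalg.norm(segment) + 1e-12)
    scores = [float(np.dot(segment, s[:len(segment)])) for s in shape_codebook]
    shape_index = int(np.argmax(scores))
    return sign_bit, amp_index, shape_index

# Illustrative use with an assumed two-entry shape codebook:
shapes = [np.array([0.2, 0.6, 1.0, 0.6, 0.2]), np.array([0.0, 0.3, 1.0, 0.3, 0.0])]
shapes = [s / np.linalg.norm(s) for s in shapes]
res = np.random.randn(256)
print(encode_first_glottal_pulse(res, pulse_pos=17, shape_codebook=shapes))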

US7693710B2
CLAIM 3
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
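
Claim 3 (and claims 11, 15 and 23) locate the first glottal pulse as the sample of maximum amplitude within a pitch period and quantize that position. A minimal sketch of the search over an LP residual, with an assumed uniform position quantizer, follows; the residual input and the quantization step are assumptions used only for illustration.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, q_step=4):
    """Find the max-amplitude sample in the first pitch period and quantize its position."""
    period = residual[:pitch_period]
    position = int(np.argmax(np.abs(period)))    # sample of maximum amplitude
    quantized_index = position // q_step         # assumed uniform quantizer
    reconstructed_position = quantized_index * q_step
    return position, quantized_index, reconstructed_position

# Illustrative use: a synthetic residual with a single pulse at sample 37.
res = np.zeros(120)
res[37] = 1.3
print(first_glottal_pulse_position(res, pitch_period=60))   # (37, 9, 36)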

US7693710B2
CLAIM 4
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
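
Claim 4 (and claims 16 and 24) compute the energy information parameter from a maximum of the signal energy for frames classified as voiced or onset, and from an average energy per sample for other frames. The short sketch below illustrates that case split; the window length, the fallback values and the dB representation are assumptions rather than features recited in the claim.

import numpy as np

def energy_information_parameter(frame, frame_class, pitch_period=None):
    """Energy parameter following the claim 4 case split (illustrative only)."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        # Maximum of a short-term signal energy, measured over consecutive windows.
        win = pitch_period if pitch_period else 40        # assumed window length
        energies = [np.sum(x[i:i + win] ** 2)
                    for i in range(0, len(x) - win + 1, win)]
        e = max(energies) if energies else np.sum(x ** 2)
    else:
        # Average energy per sample for unvoiced and transition frames.
        e = np.sum(x ** 2) / len(x)
    return 10.0 * np.log10(e + 1e-12)                      # assumed dB representation

print(energy_information_parameter(np.random.randn(256), "voiced", pitch_period=64))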

US7693710B2
CLAIM 5
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
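
Claim 5 (and claim 17) control the energy of the synthesized signal by scaling it so that its energy at the beginning of the first good frame matches the energy at the end of the last concealed frame, and by converging toward the transmitted energy information parameter by the end of that frame while limiting any increase. The sketch below uses a sample-wise linear gain ramp; the measurement windows, the linear interpolation shape and the gain cap are assumptions, not the patent's stated procedure.

import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    """Scale synth so its start matches e_end_concealed and its end tends to e_target."""
    x = np.asarray(synth, dtype=float)
    e_begin = np.mean(x[:32] ** 2) + 1e-12     # assumed 32-sample measurement window
    e_end = np.mean(x[-32:] ** 2) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match concealed-frame energy
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to received energy
    gains = np.linspace(g0, g1, len(x))        # assumed linear evolution across the frame
    return x * gains

frame = np.random.randn(256)
scaled = scale_first_good_frame(frame, e_end_concealed=0.5, e_target=1.0)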

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (PC mode) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
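
Claim 7 (and claim 19) hold the scaling gain constant across the first good frame in two situations: a voiced-to-unvoiced transition, and a transition from comfort noise to active speech. The compact sketch below expresses only that decision logic; the class labels come from the claim wording, while the function signature is an assumption.

def use_flat_gain(last_good_class, first_good_class,
                  last_good_was_comfort_noise, first_good_is_active_speech):
    """Return True when the scaling gain should be held constant across the frame."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class == "unvoiced")
    noise_to_speech = last_good_was_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or noise_to_speech

print(use_flat_gain("voiced", "unvoiced", False, False))   # True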

US7693710B2
CLAIM 8
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (PC mode) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .
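
Claims 8 and 9 (with device counterparts 20, 21 and 25, and decoder-side claim 12) adjust the energy of the LP excitation using the charted relation E q = E 1 (E LP0 / E LP1), where E LP0 and E LP1 are energies of the impulse responses of the LP synthesis filters of the last good frame before the erasure and of the first good frame after it. The sketch below computes those impulse-response energies directly from LP coefficients; the 64-sample impulse-response length and the example coefficient sets are assumptions for illustration.

import numpy as np

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z)."""
    a = np.asarray(a_coeffs, dtype=float)
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0           # unit impulse input
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]         # all-pole recursion
        h[n] = acc / a[0]
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_prev, a_curr):
    """E_q = E_1 * (E_LP0 / E_LP1), per the relation recited in the charted claims."""
    e_lp0 = lp_impulse_response_energy(a_prev)  # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(a_curr)  # first good frame after the erasure
    return e1 * e_lp0 / e_lp1

# Illustrative LP coefficient sets for A(z) = 1 + a1*z^-1 (assumed values):
print(adjusted_excitation_energy(e1=1.0, a_prev=[1.0, -0.9], a_curr=[1.0, -0.5]))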

US7693710B2
CLAIM 10
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (PC mode) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (PC mode) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
MELP : The New Federal Standard At 2400 Bps . This paper describes the new U . S . Federal Standard at 2400 bps . The Mixed Excitation Linear Prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10) . This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10 . The MELP coder is based on the traditional LPC model (frame erasure) , but includes additional features to improve its performance .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
2002 IEEE SPEECH CODING WORKSHOP PROCEEDINGS. : 144-146 2002

Publication Year: 2002

The Adaptive Multi-rate Wideband Codec: History And Performance

VoiceAge Corporation

Salami, Bessette, Lefebvre, Jelinek, Rotola-Pukkila, Vainio, Mikkola, Jarvinen, IEEE
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (first time) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time (onset frame) the same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2nd and 3rd generation mobile communication systems .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .
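
A minimal sketch, assuming a per-sample linear gain ramp, of the energy control recited in claim 5: the gain at the start of the first good frame matches the energy at the end of the last concealed frame, while the end-of-frame gain converges toward the transmitted energy information with the upward step capped. Function name, the 64-sample measurement windows and the cap value are assumptions.

    import numpy as np

    def scale_first_good_frame(synth, e_end_concealed, e_transmitted, max_up=2.0):
        e_begin = np.sum(synth[:64] ** 2) + 1e-9      # energy at frame start
        e_end = np.sum(synth[-64:] ** 2) + 1e-9       # energy at frame end
        g0 = np.sqrt(e_end_concealed / e_begin)       # match concealed-frame energy
        g1 = np.sqrt(e_transmitted / e_end)           # converge to sent energy info
        g1 = min(g1, max_up * g0)                     # limit the increase in energy
        gains = np.linspace(g0, g1, len(synth))       # sample-by-sample ramp
        return synth * gains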

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .
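
A one-line illustration of the onset safeguard in claim 6, with a hypothetical ceiling value; the actual limit used by the patent is not reproduced here.

    ONSET_GAIN_CEILING = 1.0   # assumed, not the patented value

    def limit_gain_for_onset(gain, frame_class):
        # clamp the scaling gain when the first good frame is classified as onset
        return min(gain, ONSET_GAIN_CEILING) if frame_class == "ONSET" else gain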

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .
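
A minimal sketch of the gain-equalization rule in claim 7: during a voiced-to-unvoiced transition, or when active speech resumes after comfort noise, the gain at the beginning of the first good frame is simply set equal to the gain used at its end, so no ramp is applied. The class labels and flags are assumptions.

    def starting_gain(g_begin, g_end, last_class, first_class,
                      last_was_comfort_noise, first_is_active_speech):
        voiced_like = {"VOICED TRANSITION", "VOICED", "ONSET"}
        voiced_to_unvoiced = last_class in voiced_like and first_class == "UNVOICED"
        noise_to_speech = last_was_comfort_noise and first_is_active_speech
        if voiced_to_unvoiced or noise_to_speech:
            return g_end          # flat gain across the frame
        return g_begin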

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .
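
A minimal sketch of the relation recited in claim 9, approximating each LP filter gain by the energy of its truncated impulse response and applying E_q = E_1 · (E_LP0 / E_LP1) when the new filter's gain is higher. The truncation length and the assumption that the concealed frames reuse the last good frame's LP coefficients are illustrative choices, not taken from the patent.

    import numpy as np

    def lp_impulse_response_energy(a_coeffs, length=64):
        # Energy of the truncated impulse response of the all-pole synthesis
        # filter 1/A(z), with a_coeffs = [1, a1, a2, ...]; used as a proxy for
        # the LP filter gain.
        h = np.zeros(length)
        for n in range(length):
            acc = 1.0 if n == 0 else 0.0
            for k in range(1, min(n, len(a_coeffs) - 1) + 1):
                acc -= a_coeffs[k] * h[n - k]
            h[n] = acc
        return float(np.sum(h ** 2))

    def adjusted_excitation_energy(e1, a_before_erasure, a_first_good):
        # E_LP0: filter of the last good frame before the erasure (assumed to
        # be reused during concealment); E_LP1: filter of the first good frame.
        e_lp0 = lp_impulse_response_energy(a_before_erasure)
        e_lp1 = lp_impulse_response_energy(a_first_good)
        if e_lp1 > e_lp0:                      # new LP gain is higher
            return e1 * e_lp0 / e_lp1          # E_q = E_1 * (E_LP0 / E_LP1)
        return e1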

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (first time) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time (onset frame) the same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
The Adaptive Multi-rate Wideband Codec : History And Performance . This paper gives the history and performance of the Adaptive Multi-Rate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services . The AMR-WB speech codec algorithm was selected in December 2000 , and the corresponding specifications were approved in March 2001 . In July 2001 , the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s . The adoption of AMR-WB by ITU-T is of significant importance since for the first time t (first non) he same codec is adopted for wireless as well as wireline services . AMR-WB uses an extended audio bandwidth from 3 . 4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2(nd) and 3(rd) generation mobile communication systems .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. 3 (1): 59-71 JAN 1995

Publication Year: 1995

ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH

AT&T Bell Labs

Chen, Gersho
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (first time) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response (first impulse, impulse responses) , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time (onset frame) a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (first time) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response (first impulse, impulse responses) , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time (onset frame) a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .
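
A minimal sketch of the phase-information parameter described in claim 14: the searcher locates the first glottal pulse and encodes its position together with a shape index, a sign and an amplitude. The shape codebook, the correlation-based shape match and the amplitude rounding are illustrative assumptions, not the patented quantizers.

    import numpy as np

    def encode_first_glottal_pulse(residual, pitch, shape_codebook):
        # position: sample of maximum absolute amplitude within one pitch cycle
        segment = residual[:int(pitch)]
        pos = int(np.argmax(np.abs(segment)))
        sign = 1 if segment[pos] >= 0 else 0
        amplitude = float(np.abs(segment[pos]))
        # shape: best-correlating entry of a small codebook around the pulse
        L = shape_codebook.shape[1]
        start = max(pos - L // 2, 0)
        window = residual[start:start + L]
        window = np.pad(window, (0, L - len(window)))
        shape_idx = int(np.argmax(np.abs(shape_codebook @ window)))
        return {"position": pos, "shape": shape_idx, "sign": sign,
                "amplitude": round(amplitude, 3)}   # stand-in for a quantizer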

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .
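
A minimal sketch of the position measurement and quantization described in claim 15: the first glottal pulse is taken as the sample of maximum amplitude within a pitch period, and only its position is quantized. The uniform quantizer step is an assumption made for illustration.

    import numpy as np

    def quantize_first_pulse_position(residual, pitch, step=2):
        segment = residual[:int(pitch)]
        pos = int(np.argmax(np.abs(segment)))   # sample of maximum amplitude
        return (pos // step) * step             # uniformly quantized position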

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .
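
A minimal sketch of the energy information parameter described in claim 16: frames classified as voiced or onset use the maximum of the signal energy, other frames use the average energy per sample. A dB conversion and quantizer would normally follow; both are omitted here, and the class labels are assumptions.

    import numpy as np

    def energy_information(frame, frame_class):
        if frame_class in ("VOICED", "ONSET"):
            return float(np.max(frame ** 2))    # maximum of the signal energy
        return float(np.mean(frame ** 2))       # average energy per sample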

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (automatic gain control) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
ADAPTIVE POSTFILTERING FOR QUALITY ENHANCEMENT OF CODED SPEECH . An adaptive postfiltering algorithm for enhancing the perceptual quality of coded speech is presented , The postfilter consists of a long-term postfilter section in cascade with a shortterm postfilter section and includes spectral tilt compensation and automatic gain control (communication link) , The long-term section emphasizes pitch harmonics and attenuates the spectral valleys between pitch harmonics , The short-term section , on the other hand , emphasizes speech formants and attenuates the spectral valleys between formants , Both filter sections have poles and zeros , Unlike earlier postfilters that often introduced a substantial amount of muffling to the output speech , our postfilter significantly reduces this effect by minimizing the spectral tilt in its frequency response , As a result , this postfilter achieves noticeable noise reduction while introducing only minimal distortion in speech , The complexity of the postfilter is quite low . Variations of this postfilter are now being used in several national and international speech coding standards , This paper presents for the first time a complete description of our original postfiltering algorithm and the underlying ideas that motivated its development .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. 3 (1): 94-98 JAN 1995

Publication Year: 1995

SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING

National Tsing Hua University (NTHU) Hsinchu, Taiwan

Kuo, Jean, Wang
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .
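
Claims 2 and 3 (and their counterparts 10, 11, 14, 15, 22 and 23) describe the encoder-side phase information: the first glottal pulse is located as the sample of maximum amplitude within a pitch period, its position is quantized, and its shape, sign and amplitude are encoded. A minimal sketch under assumed inputs (an LP residual signal, an open-loop pitch value, a uniform position quantizer) follows; none of these specifics are taken from the patent.

```python
# Hedged sketch of locating and quantizing the first glottal pulse; the residual
# signal, pitch value and quantizer granularity are illustrative assumptions.
import numpy as np

def first_glottal_pulse_info(residual, pitch, num_bits=6):
    """Return (quantized position, sign, amplitude) of the first glottal pulse."""
    period = np.asarray(residual[:int(pitch)], dtype=float)  # first pitch period
    pos = int(np.argmax(np.abs(period)))          # sample of maximum amplitude
    step = max(1, int(pitch) >> num_bits)         # uniform quantization step
    q_pos = (pos // step) * step                  # quantized position in the period
    sign = 1 if period[pos] >= 0 else -1
    amplitude = float(np.abs(period[pos]))
    return q_pos, sign, amplitude

# Example with a synthetic residual containing one strong pulse at sample 23.
res = np.zeros(160); res[23] = -0.9
print(first_glottal_pulse_info(res, pitch=57.0))   # -> (23, -1, 0.9)
```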

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .
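
Claim 4 (mirrored by claims 16 and 24) ties the energy information parameter to the frame class: a maximum of the signal energy for voiced or onset frames, an average energy per sample otherwise. A short Python sketch of that rule follows; the window length, the pitch-synchronous windowing and the dB conversion are assumptions added for illustration.

```python
# Hedged sketch of the class-dependent energy information parameter of claim 4.
import numpy as np

def energy_parameter(frame, frame_class, pitch=None):
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        win = int(pitch) if pitch else 32         # assumed analysis window
        energies = [np.mean(frame[i:i + win] ** 2)
                    for i in range(0, len(frame) - win + 1, win)]
        e = max(energies) if energies else np.mean(frame ** 2)   # maximum energy
    else:
        e = np.mean(frame ** 2)                   # average energy per sample
    return 10.0 * np.log10(e + 1e-12)             # expressed in dB before quantization
```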

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .
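
Claim 5 (and device claim 17) describes a two-sided energy control in the first good frame after an erasure: scale the synthesized signal so that its starting energy matches the energy at the end of the last concealed frame, then converge toward the energy signalled by the received energy parameter by the end of the frame, while limiting any energy increase. The sketch below interpolates a per-sample gain between those two targets; the window sizes and the gain ceiling are assumed values.

```python
# Hedged sketch of the claim 5 energy scaling and convergence.
import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_received, max_gain=1.5):
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    e_begin = np.mean(synth[: n // 4] ** 2) + 1e-12    # energy at frame start
    e_end = np.mean(synth[n - n // 4:] ** 2) + 1e-12   # energy at frame end
    g_begin = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match concealment
    g_end = min(np.sqrt(e_received / e_end), max_gain)           # converge to target
    gains = np.linspace(g_begin, g_end, n)             # sample-by-sample interpolation
    return synth * gains
```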

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .
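
Claims 6 and 7 (mirrored by claims 18 and 19) refine the gain used for that scaling: it is capped when the first good frame is an onset, and it is made equal at the beginning and the end of the frame during a voiced-to-unvoiced transition or when leaving a comfort-noise period. A small conditional sketch follows; the class labels and the cap value are assumptions.

```python
# Hedged sketch of the gain selection rules of claims 6-7; labels and the
# onset cap are illustrative, not taken from the patent.
def select_begin_gain(g_begin, g_end, last_class, first_class,
                      last_was_comfort_noise, onset_cap=1.2):
    voiced_like = {"voiced transition", "voiced", "onset"}
    if (last_class in voiced_like and first_class == "unvoiced") or \
       (last_was_comfort_noise and first_class != "comfort noise"):
        g_begin = g_end                    # no artificial energy step at the transition
    if first_class == "onset":
        g_begin = min(g_begin, onset_cap)  # claim 6: limit the gain for onset frames
    return g_begin
```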

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .
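
Claims 9, 12, 21 and 25 recite the relation E_q = E_1 × ( E_LP0 / E_LP1 ) for adjusting the excitation energy when the energy parameter is not transmitted, with E_LP0 and E_LP1 the energies of the impulse responses of the previous and current LP filters. The sketch below computes those impulse-response energies and applies the adjustment only when the new LP filter has the higher gain, as recited in claims 8 and 20; the filter order, impulse-response length and coefficient convention are assumptions.

```python
# Hedged sketch of E_q = E_1 * (E_LP0 / E_LP1); coefficient layout and lengths
# are illustrative assumptions.
import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a_coeffs, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z).
    a_coeffs = [1, a1, ..., ap] are the denominator coefficients of A(z)."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_prev, a_curr):
    """Apply E_q = E_1 * (E_LP0 / E_LP1) only when the new filter gain is higher."""
    e_lp0 = lp_impulse_energy(a_prev)    # last good frame before the erasure
    e_lp1 = lp_impulse_energy(a_curr)    # first good frame after the erasure
    return e1 * e_lp0 / e_lp1 if e_lp1 > e_lp0 else e1
```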

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
SPEECH CLASSIFICATION EMBEDDED IN ADAPTIVE CODEBOOK SEARCH FOR LOW BIT-RATE CELP CODING . This correspondence proposes a new CELP coding method which embeds speech classification in adaptive codebook (sound signal, speech signal) search . This approach can retain the synthesized speech quality at bit-rates below 4 kb/s . A pitch analyzer is designed to classify each frame by its periodicity , and with a finite-state machine , one of four states is determined . Then the adaptive codebook search scheme is switched according to the state . Simulation results show that higher SEGSNR and lower computation complexity can be achieved , and the pitch contour of the synthesized speech is smoother than that produced by conventional CELP coders .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. 3 (4): 242-250 JUL 1995

Publication Year: 1995

A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING

Georgia Institute of Technology

McCree, Barnwell
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING . Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech . These vocoders can produce intelligible speech at low data rates (800-2400 b/s) , but they often sound synthetic and generate annoying artifacts such as buzzes , thumps , and tonal noises . These problems increase dramatically if acoustic background noise (LP filter) is present at the speech input . This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech . The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise . A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system . Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment . In fact , diagnostic acceptibility measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING . Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech . These vocoders can produce intelligible speech at low data rates (800-2400 b/s) , but they often sound synthetic and generate annoying artifacts such as buzzes , thumps , and tonal noises . These problems increase dramatically if acoustic background noise (LP filter) is present at the speech input . This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech . The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise . A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system . Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment . In fact , diagnostic acceptibility measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING . Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech . These vocoders can produce intelligible speech at low data rates (800-2400 b/s) , but they often sound synthetic and generate annoying artifacts such as buzzes , thumps , and tonal noises . These problems increase dramatically if acoustic background noise (LP filter) is present at the speech input . This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech . The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise . A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system . Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment . In fact , diagnostic acceptibility measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING . Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech . These vocoders can produce intelligible speech at low data rates (800-2400 b/s) , but they often sound synthetic and generate annoying artifacts such as buzzes , thumps , and tonal noises . These problems increase dramatically if acoustic background noise (LP filter) is present at the speech input . This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech . The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise . A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system . Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment . In fact , diagnostic acceptibility measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING . Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech . These vocoders can produce intelligible speech at low data rates (800-2400 b/s) , but they often sound synthetic and generate annoying artifacts such as buzzes , thumps , and tonal noises . These problems increase dramatically if acoustic background noise (LP filter) is present at the speech input . This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech . The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise . A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system . Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment . In fact , diagnostic acceptibility measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING . Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech . These vocoders can produce intelligible speech at low data rates (800-2400 b/s) , but they often sound synthetic and generate annoying artifacts such as buzzes , thumps , and tonal noises . These problems increase dramatically if acoustic background noise (LP filter) is present at the speech input . This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech . The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise . A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system . Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment . In fact , diagnostic acceptibility measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
IEEE NETWORK. 12 (5): 40-48 SEP-OCT 1998

Publication Year: 1998

A Survey Of Packet Loss Recovery Techniques For Streaming Audio

University College London (UCL)

Perkins, Hodson, Hardman
US7693710B2
CLAIM 1
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .
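To make the construction step of claims 1 and 13 easier to follow, here is a minimal sketch of building the artificial periodic excitation as a low-pass filtered pulse train centered on the quantized first glottal pulse position and repeated at the average pitch; the filter impulse response, frame length and pitch values are illustrative assumptions.

```python
import numpy as np

def build_periodic_excitation(n_samples, first_pulse_pos, avg_pitch, lpf_ir):
    """Artificial periodic excitation for a lost onset frame: center the first
    impulse response of a low-pass filter at the quantized first glottal pulse
    position, then repeat it every avg_pitch samples up to the end of the
    affected region (n_samples)."""
    assert avg_pitch > 0
    exc = np.zeros(n_samples)
    half = len(lpf_ir) // 2
    pos = int(first_pulse_pos)
    while pos < n_samples:
        for k, h in enumerate(lpf_ir):
            idx = pos - half + k            # impulse response centered on the pulse
            if 0 <= idx < n_samples:
                exc[idx] += h
        pos += int(round(avg_pitch))        # next pulse one average pitch later
    return exc

# Illustrative call with an assumed windowed-sinc low-pass impulse response:
# lpf_ir = np.hamming(65) * np.sinc(np.linspace(-4, 4, 65))
# exc = build_periodic_excitation(256, first_pulse_pos=17, avg_pitch=64, lpf_ir=lpf_ir)
```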

US7693710B2
CLAIM 2
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .
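To concretize the phase-information element of claims 2, 10, 14 and 22, the sketch below encodes a shape index, a sign and an amplitude for the first glottal pulse; the shape codebook, its length and the nearest-neighbour search are hypothetical choices and not taken from the patent.

```python
import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_codebook):
    """Encode shape, sign and amplitude of the first glottal pulse.
    `shape_codebook` is an (n_shapes, L) array of unit-amplitude prototypes
    (an assumption); returns (shape_index, sign, amplitude)."""
    L = shape_codebook.shape[1]
    seg = np.zeros(L)
    avail = np.asarray(residual[pulse_pos:pulse_pos + L], dtype=float)
    seg[:len(avail)] = avail                       # pulse segment, zero-padded
    amp = float(np.max(np.abs(seg)))
    sign = 1 if seg[int(np.argmax(np.abs(seg)))] >= 0 else -1
    norm = seg * sign / (amp + 1e-12)              # normalized, positive-peak shape
    errs = np.sum((shape_codebook - norm) ** 2, axis=1)
    return int(np.argmin(errs)), sign, amp         # nearest codebook shape
```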

US7693710B2
CLAIM 3
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .
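The position-determination step of claims 3, 11, 15 and 23 (take the maximum-amplitude sample within a pitch period as the first glottal pulse and quantize its position) can be sketched as follows; the uniform 4-sample quantization grid is an assumption for illustration.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, grid=4):
    """Locate the sample of maximum amplitude within the first pitch period and
    quantize its position on a uniform grid (grid size is an assumption)."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))          # maximum-amplitude sample
    q_pos = int(round(pos / grid)) * grid          # quantized position
    return min(q_pos, pitch_period - 1)
```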

US7693710B2
CLAIM 4
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .
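One hedged reading of the energy-information computation in claims 4, 16 and 24 is sketched below: a maximum-based energy for voiced/onset frames and an average energy per sample otherwise; the exact windowing used by the patent is not reproduced here.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter: maximum of the signal energy for frames
    classified as voiced or onset, average energy per sample for other frames
    (per the claim wording; the per-sample maximum is an assumed reading)."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x ** 2))     # maximum of the signal energy
    return float(np.mean(x ** 2))        # average energy per sample
```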

US7693710B2
CLAIM 5
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .
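The two-sided energy control of claims 5 and 17 (match the concealed signal's end energy at the start of the first good frame, then converge toward the transmitted energy while limiting any increase) could look roughly like the following; the measurement windows, linear gain interpolation and gain cap are assumptions.

```python
import numpy as np

def control_recovery_energy(synth, e_concealed_end, e_received, max_gain=2.0):
    """Scale the first good frame after an erasure: start from a gain matching
    the concealed signal's end energy, converge to the gain implied by the
    received energy parameter, and limit any energy increase via max_gain."""
    x = np.asarray(synth, dtype=float)
    n = len(x)
    w = max(1, n // 4)                                       # assumed measurement window
    e_begin = float(np.mean(x[:w] ** 2)) + 1e-12
    e_end = float(np.mean(x[-w:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_concealed_end / e_begin), max_gain)   # continuity with the erasure
    g1 = min(np.sqrt(e_received / e_end), max_gain)          # convergence target
    gains = np.linspace(g0, g1, n)                           # sample-wise interpolation
    return x * gains
```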

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (concealing frame erasure) (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (error concealment, packet loss) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 13
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 14
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 15
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 16
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 17
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, packet loss) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (error concealment, packet loss) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 20
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, packet loss) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 22
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 23
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 24
. A device for conducting concealment (error concealment, packet loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (error concealment, packet loss) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (error concealment, packet loss) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A Survey Of Packet Loss Recovery Techniques For Streaming Audio . We survey a number of packet loss (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) recovery techniques for streaming audio applications operating using IP multicast . We begin with a discussion of the loss and delay characteristics of an IP multicast channel , and from this show the need for packet loss recovery . Recovery techniques may be divided into two classes : sender- and receiver-based . We compare and contrast several sender-based recovery schemes : forward error correction (both media-specific and media-independent) , interleaving , and retransmission . In addition , a number of error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) schemes are discussed . We conclude with a series of recommendations for repair schemes to be used based on application requirements and network conditions .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V. : 1331-1334 1997

Publication Year: 1997

Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator

Interval Research Corporation, Palo Alto, CA, USA

Scheirer, Slaney, Ieee Comp Soc
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .
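Claim 7's special cases (use the end-of-frame gain across the whole first good frame during voiced-to-unvoiced and comfort-noise-to-active-speech transitions) reduce to a small predicate; the class labels below mirror the claim wording and are otherwise illustrative.

```python
def use_flat_recovery_gain(last_class, first_class,
                           last_was_comfort_noise, first_is_active_speech):
    """True when the scaling gain at the start of the first good frame should
    simply equal the gain used at its end (claim 7's two transition cases)."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
    cng_to_active = last_was_comfort_noise and first_is_active_speech
    return voiced_to_unvoiced or cng_to_active
```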

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating (LP filter excitation signal) long (2.4 second) segments of sound .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating (LP filter excitation signal) long (2.4 second) segments of sound .
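
For orientation, the energy-adjustment relation recited in claims 9, 12, 21 and 25 can be sketched numerically as follows (Python). The sketch assumes the reading E_q = E_1 · (E_LP0 / E_LP1) with the E_LP terms taken as impulse-response energies of the LP synthesis filters; the helper names, the scipy.signal.lfilter call and the 64-sample truncation are illustrative choices, not details taken from the patent or from the cited references.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a, length=64):
    """Energy of the truncated impulse response of the LP synthesis filter 1/A(z).
    'a' holds the denominator coefficients [1, a1, ..., aM]; the 64-sample
    truncation is an illustrative choice, not a value taken from the claims."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)  # impulse response of 1/A(z)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * E_LP0 / E_LP1 in the notation of claim 9, where E_1 is the
    energy at the end of the current frame, E_LP0 the impulse-response energy
    of the LP filter of the last good frame received before the erasure, and
    E_LP1 that of the first good frame received after the erasure."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1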

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating (LP filter excitation signal) long (2.4 second) segments of sound .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .
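
The energy-information computation recited in claims 4, 16 and 24 (maximum of the signal energy for voiced or onset frames, average energy per sample otherwise) can be illustrated with the minimal sketch below (Python); the class labels, the use of the plain squared-sample maximum and the dB conversion are assumptions made for readability rather than details from the patent or the cited references.

import numpy as np

def energy_information_parameter(frame, frame_class):
    """Energy information parameter computed in relation to the maximum of the
    signal energy for frames classified as voiced or onset, and in relation to
    the average energy per sample for other frames (claims 4, 16, 24).
    The dB scaling is an illustrative assumption."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energy = float(np.max(x ** 2))   # maximum of the signal energy
    else:
        energy = float(np.mean(x ** 2))  # average energy per sample
    return 10.0 * np.log10(energy + 1e-12)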

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .
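
The gain handling recited in claims 7 and 19 reduces to two conditions under which the gain used at the beginning of the first non erased frame is made equal to the gain used at its end; a minimal sketch follows (Python), with class labels and flag names chosen here for illustration only.

def gain_at_frame_beginning(g_begin, g_end, last_class, first_class,
                            last_frame_comfort_noise, first_frame_active_speech):
    """Return the gain to use at the beginning of the first non-erased frame.
    It is forced equal to the end-of-frame gain (i) on a voiced-to-unvoiced
    transition and (ii) on a transition from comfort noise to active speech,
    mirroring the conditions recited in claims 7 and 19."""
    voiced_like = ("voiced transition", "voiced", "onset")
    if last_class in voiced_like and first_class == "unvoiced":
        return g_end
    if last_frame_comfort_noise and first_frame_active_speech:
        return g_end
    return g_begin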

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating (LP filter excitation signal) long (2.4 second) segments of sound .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating (LP filter excitation signal) long (2.4 second) segments of sound .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals (speech signal, decoder determines concealment) from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating long (2.4 second) segments of sound .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Construction And Evaluation Of A Robust Multifeature Speech/music Discriminator . We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input . We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals , and combined them in several multidimensional classification frameworks . We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system . For the datasets currently in use , the best classifier classifies with 5.8% error on a frame-by-frame basis , and 1.4% error when integrating (LP filter excitation signal) long (2.4 second) segments of sound .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
SPEECH COMMUNICATION. 12 (2): 193-204 JUN 1993

Publication Year: 1993

LOW-DELAY SPEECH CODING

University of California, Santa Barbara, Simon Fraser University

Cuperman, Gersho
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
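
Claims 1 and 13 recite building the periodic excitation artificially as a low-pass filtered train of pulses anchored on the quantized position of the first glottal pulse; a minimal sketch follows (Python). The low-pass impulse response, its centering convention and the rounding of the average pitch are illustrative assumptions, not details taken from the patent or from the Cuperman-Gersho paper.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, pitch_avg, lowpass_ir):
    """Artificial periodic excitation: the first impulse response of the
    low-pass filter is centred on the quantized position of the first glottal
    pulse, and the remaining responses are placed one average pitch period
    apart until the end of the reconstructed segment (claims 1 and 13)."""
    n = int(frame_len)
    excitation = np.zeros(n)
    half = len(lowpass_ir) // 2
    pos = int(first_pulse_pos)
    step = max(1, int(round(pitch_avg)))       # spacing = average pitch value
    while pos < n:
        start = pos - half                     # centre the response on the pulse
        for i, h in enumerate(lowpass_ir):
            idx = start + i
            if 0 <= idx < n:
                excitation[idx] += h
        pos += step                            # next pulse one average pitch later
    return excitation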

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
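
The phase-information determination recited in claims 3, 11 and 15 (the first glottal pulse taken as the sample of maximum amplitude within a pitch period, with its position quantized) can be sketched as follows (Python); the uniform two-sample quantization step is an illustrative assumption, not a value from the patent.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=2):
    """Measure the sample of maximum amplitude within one pitch period as the
    first glottal pulse and quantize its position (claims 3, 11, 15)."""
    segment = np.asarray(residual[:int(pitch_period)], dtype=float)
    pos = int(np.argmax(np.abs(segment)))   # sample of maximum amplitude
    return (pos // step) * step             # coarsely quantized position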

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
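
The energy control recited in claims 5 and 17 can be pictured as a per-sample gain ramp over the first non erased frame: its start matches the synthesized energy at the frame beginning to the energy at the end of the last concealed frame, its end converges toward the energy conveyed by the energy information parameter, and any energy increase is capped. A minimal sketch follows (Python); the square-root gains, the linear ramp and the cap value are illustrative assumptions.

import numpy as np

def energy_control_gains(e_end_concealed, e_begin_frame, e_target_end,
                         frame_len, max_gain=2.0):
    """Per-sample scaling gains for the first non-erased frame after an
    erasure: start from a gain that makes the frame-start energy similar to
    the end-of-concealment energy, converge to the gain implied by the
    received energy information, and limit any energy increase."""
    eps = 1e-12
    g_start = np.sqrt(e_end_concealed / max(e_begin_frame, eps))
    g_end = np.sqrt(e_target_end / max(e_begin_frame, eps))
    g_start, g_end = min(g_start, max_gain), min(g_end, max_gain)
    return np.linspace(g_start, g_end, int(frame_len))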

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (transmission error) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
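
Claims 8, 12, 20 and 25 condition the excitation-energy adjustment on the LP filter gain of the first non erased frame exceeding that of the last concealed frame; the condition is sketched below in a self-contained form (Python). Using the square root of the truncated impulse-response energy as the "gain of the LP filter", and the 64-sample truncation, are illustrative assumptions.

import numpy as np
from scipy.signal import lfilter

def lp_filter_gain(a, length=64):
    """Proxy for the gain of the LP synthesis filter 1/A(z): the square root
    of the energy of its truncated impulse response."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sqrt(np.sum(h ** 2)))

def excitation_energy_needs_adjustment(a_last_concealed, a_first_good):
    """True when the LP gain of the first non-erased frame received after the
    erasure is higher than the LP gain of the last concealed frame, i.e. the
    condition under which the claims adjust the excitation energy to E_q."""
    return lp_filter_gain(a_first_good) > lp_filter_gain(a_last_concealed)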

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction (E q) is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
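
For the phase-information encoding recited in claims 2, 10 and 14 (shape, sign and amplitude of the first glottal pulse), a minimal sketch follows (Python); the small shape codebook, the correlation-based shape selection and the crude logarithmic amplitude index are illustrative assumptions rather than structures disclosed in the patent or the cited paper.

import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_codebook):
    """Encode the sign, a shape index and an amplitude index for the first
    glottal pulse found at 'pulse_pos' in the LP residual (claims 2, 10, 14).
    'shape_codebook' is a (num_shapes x shape_len) array of candidate pulse
    shapes, assumed to be known to both encoder and decoder."""
    x = np.asarray(residual, dtype=float)
    sign = 1 if x[pulse_pos] >= 0 else -1
    amplitude = abs(float(x[pulse_pos]))
    shape_len = shape_codebook.shape[1]
    segment = x[pulse_pos:pulse_pos + shape_len]
    if len(segment) < shape_len:
        segment = np.pad(segment, (0, shape_len - len(segment)))
    shape_idx = int(np.argmax(shape_codebook @ (sign * segment)))   # best match by correlation
    amp_idx = int(np.clip(round(np.log2(amplitude + 1.0)), 0, 15))  # crude log-domain index
    return shape_idx, sign, amp_idx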

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction (E q) is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (transmission error) , limits to a given value a gain used for scaling the synthesized sound signal .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
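The two transition cases recited in claim 19 above can be captured by a small decision helper such as the hedged sketch below; the class labels, the flags and the default gain are illustrative assumptions rather than the codec's actual logic.

def initial_scaling_gain(g_end, last_class, first_class, last_was_comfort_noise,
                         first_is_active_speech, g_begin_default):
    # Returns the gain used at the beginning of the first non erased frame after erasure.
    # In the two transition cases of the claim it is made equal to the gain used at the
    # end of that frame (g_end); otherwise an ordinary default applies.
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
    inactive_to_active = last_was_comfort_noise and first_is_active_speech
    return g_end if (voiced_to_unvoiced or inactive_to_active) else g_begin_default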

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
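Claim 20 above conditions the energy adjustment on a comparison of LP filter gains. The sketch below illustrates one common way such a gain can be measured (energy of the truncated impulse response of the synthesis filter 1/A(z)) and how the trigger condition could be tested; the truncation length and the helper names are assumptions.

import numpy as np
from scipy.signal import lfilter

def lp_gain(a, n=64):
    # "Gain" of the LP synthesis filter 1/A(z), measured here as the energy of its
    # truncated impulse response; a = [1, a1, ..., ap]. The truncation length n is
    # an assumption.
    d = np.zeros(n)
    d[0] = 1.0
    h = lfilter([1.0], a, d)
    return float(np.dot(h, h))

def needs_energy_adjustment(a_first_good, a_last_erased):
    # The adjustment of the excitation energy is triggered only when the LP filter of
    # the first non erased frame has a higher gain than the filter of the last erased frame.
    return lp_gain(a_first_good) > lp_gain(a_last_erased)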

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction (E q) is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
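The relation recited in claim 21 above, E_q = E_1 · (E_LP0 / E_LP1), can be evaluated directly once the three energies are available (the impulse-response energies can be obtained as in the sketch following claim 20). The helper below and the numeric example are illustrative only.

def corrected_excitation_energy(e1, e_lp0, e_lp1):
    # E_q = E_1 * (E_LP0 / E_LP1): target energy of the LP filter excitation in the
    # received first non erased frame. e1 is the energy at the end of the current frame,
    # e_lp0 and e_lp1 are the impulse-response energies of the LP filters of the last
    # good frame before the erasure and of the first good frame after it.
    return e1 * e_lp0 / e_lp1

# Example: a first-good-frame filter with the higher gain (e_lp1 > e_lp0) gives E_q < E_1,
# so the excitation is attenuated to compensate for the stronger synthesis filter.
print(corrected_excitation_energy(1.0e4, 1.6, 2.0))   # 8000.0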

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
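As an illustration of the phase-information element of claim 22 above (locating the first glottal pulse and encoding its shape, sign and amplitude), the rough sketch below searches the LP residual of a frame; the residual input, the shape codebook and the correlation-based selection rule are assumptions, not the patent's quantization scheme.

import numpy as np

def encode_first_glottal_pulse(residual, pitch, shape_codebook):
    # residual: LP residual of the frame (numpy array); pitch: pitch period in samples;
    # shape_codebook: 2-D array of prototype pulse shapes (assumed inputs).
    seg = residual[:pitch]
    pos = int(np.argmax(np.abs(seg)))              # position of the first glottal pulse
    sign = 1 if seg[pos] >= 0 else -1
    amplitude = float(abs(seg[pos]))
    half = shape_codebook.shape[1] // 2
    lo, hi = max(0, pos - half), min(len(residual), pos + half + 1)
    window = residual[lo:hi]
    scores = []
    for shape in shape_codebook:                   # pick the best-matching shape
        m = min(len(window), len(shape))
        scores.append(abs(float(np.dot(window[:m], shape[:m]))))
    return {"position": pos, "sign": sign, "amplitude": amplitude,
            "shape_index": int(np.argmax(scores))}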

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
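Claim 23 above takes the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizes its position. A minimal sketch under that reading is shown below; the uniform quantization step and the reconstruction rule are assumed placeholders, since the chart does not specify the quantizer.

import numpy as np

def quantized_pulse_position(residual, pitch, step=4):
    # The sample of maximum amplitude within the pitch period is taken as the first
    # glottal pulse; its position is quantized with a simple uniform step.
    pos = int(np.argmax(np.abs(residual[:pitch])))
    index = pos // step                            # quantizer index to transmit
    reconstructed = min(index * step + step // 2, pitch - 1)
    return index, reconstructed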

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .
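The energy information parameter of claim 24 above is computed differently for voiced or onset frames (in relation to a maximum of the signal energy) than for other frames (in relation to an average energy per sample). A hedged sketch of that split follows; the per-sample maximum used for the voiced/onset branch and the absence of windowing are simplifying assumptions.

import numpy as np

def energy_information(frame, frame_class):
    # Energy information parameter: maximum of the signal energy for voiced or onset
    # frames, average energy per sample otherwise.
    e = frame.astype(float) ** 2
    if frame_class in ("voiced", "onset"):
        return float(np.max(e))
    return float(np.mean(e))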

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (transmission error) and decoder recovery (transmission error) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
LOW-DELAY SPEECH CODING . High quality low-delay speech coding at 8-16 kbit/s can be obtained with backward adaptive analysis-by-synthesis algorithms , such as Low-Delay CELP (the new CCITT 16 kbit/s standard) , Low-Delay Vector Excitation Coding (LD-VXC) , and backward adaptive tree/trellis codecs . The paper examines and reviews some of the basic techniques underlying low delay coding algorithms and presents design and performance trade-offs for low-delay analysis-by-synthesis codecs at rates of 8-16 kbit/s . A number of approaches for improving the speech quality at 8 kbit/s are discussed . Backward pitch prediction (E q) is compared to a closed-loop forward configuration similar to that used in conventional CELP coders for the adaptive codebook (sound signal, speech signal) . Finally , robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is discussed and a number of trade-offs for reducing transmission error sensitivity are presented .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
2002 IEEE SPEECH CODING WORKSHOP PROCEEDINGS. : 62-64 2002

Publication Year: 2002

The Forward-backward Recovery Sub-codec (FB-RSC) Method: A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks

The Nippon Telegraph and Telephone Corporation (日本電信電話株式会社, Nippon Denshin Denwa Kabushiki-gaisha, NTT)

Morinaga, Mano, Kaneko, Ieee, Ieee
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
The Forward-backward Recovery Sub-codec (FB-RSC) Method : A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks . Speech-coding according to the forward-backward recovery sub-codec (FB-RSC) method is described . The method provides improved speech quality under packet-loss conditions . In our system , each packet has a maximum of three codes , a main code , a forward sub-code , and a backward sub-code . The main code represents the current frame . The forward and backward sub-codes represent the next and previous frames . The necessity of the sub-codecs is determined by a sub-codec selector , which considers the SNR of the original and synthesized signals (LP filter excitation signal) in this determination . A relatively low-compression and high-quality form of coding is used in the main codec , while coding with greater compression is used in the sub-codecs . We examined the quality of the proposed method for random loss of individual packets and pairs of consecutive packets . The result shows that our method significantly improves the concealment of packet loss .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame, packet loss) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
The Forward-backward Recovery Sub-codec (FB-RSC) Method : A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks . Speech-coding according to the forward-backward recovery sub-codec (FB-RSC) method is described . The method provides improved speech quality under packet-loss conditions . In our system , each packet has a maximum of three codes , a main code , a forward sub-code , and a backward sub-code . The main code represents the current frame (current frame, decoder determines concealment, decoder concealment) . The forward and backward sub-codes represent the next and previous frames . The necessity of the sub-codecs is determined by a sub-codec selector , which considers the SNR of the original and synthesized signals (LP filter excitation signal) in this determination . A relatively low-compression and high-quality form of coding is used in the main codec , while coding with greater compression is used in the sub-codecs . We examined the quality of the proposed method for random loss of individual packets and pairs of consecutive packets . The result shows that our method significantly improves the concealment of packet loss (current frame, decoder determines concealment, decoder concealment) .
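Once E_q from the relation in claim 9 above has been computed, the adjustment amounts to rescaling the decoded excitation of the first good frame so that its energy matches E_q; a minimal sketch of that final step is shown below (the function name and the epsilon guard are assumptions).

import numpy as np

def rescale_excitation(excitation, e_q):
    # Scale the decoded LP filter excitation of the received first non erased frame so
    # that its total energy equals the corrected energy E_q from the relation above.
    current = float(np.dot(excitation, excitation)) + 1e-12
    return excitation * np.sqrt(e_q / current)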

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame, packet loss) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
The Forward-backward Recovery Sub-codec (FB-RSC) Method : A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks . Speech-coding according to the forward-backward recovery sub-codec (FB-RSC) method is described . The method provides improved speech quality under packet-loss conditions . In our system , each packet has a maximum of three codes , a main code , a forward sub-code , and a backward sub-code . The main code represents the current frame (current frame, decoder determines concealment, decoder concealment) . The forward and backward sub-codes represent the next and previous frames . The necessity of the sub-codecs is determined by a sub-codec selector , which considers the SNR of the original and synthesized signals (LP filter excitation signal) in this determination . A relatively low-compression and high-quality form of coding is used in the main codec , while coding with greater compression is used in the sub-codecs . We examined the quality of the proposed method for random loss of individual packets and pairs of consecutive packets . The result shows that our method significantly improves the concealment of packet loss (current frame, decoder determines concealment, decoder concealment) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
The Forward-backward Recovery Sub-codec (FB-RSC) Method : A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks . Speech-coding according to the forward-backward recovery sub-codec (FB-RSC) method is described . The method provides improved speech quality under packet-loss conditions . In our system , each packet has a maximum of three codes , a main code , a forward sub-code , and a backward sub-code . The main code represents the current frame . The forward and backward sub-codes represent the next and previous frames . The necessity of the sub-codecs is determined by a sub-codec selector , which considers the SNR of the original and synthesized signals (LP filter excitation signal) in this determination . A relatively low-compression and high-quality form of coding is used in the main codec , while coding with greater compression is used in the sub-codecs . We examined the quality of the proposed method for random loss of individual packets and pairs of consecutive packets . The result shows that our method significantly improves the concealment of packet loss .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame, packet loss) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
The Forward-backward Recovery Sub-codec (FB-RSC) Method : A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks . Speech-coding according to the forward-backward recovery sub-codec (FB-RSC) method is described . The method provides improved speech quality under packet-loss conditions . In our system , each packet has a maximum of three codes , a main code , a forward sub-code , and a backward sub-code . The main code represents the current frame (current frame, decoder determines concealment, decoder concealment) . The forward and backward sub-codes represent the next and previous frames . The necessity of the sub-codecs is determined by a sub-codec selector , which considers the SNR of the original and synthesized signals (LP filter excitation signal) in this determination . A relatively low-compression and high-quality form of coding is used in the main codec , while coding with greater compression is used in the sub-codecs . We examined the quality of the proposed method for random loss of individual packets and pairs of consecutive packets . The result shows that our method significantly improves the concealment of packet loss (current frame, decoder determines concealment, decoder concealment) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (previous frames) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame, packet loss) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
The Forward-backward Recovery Sub-codec (FB-RSC) Method : A Robust Form Of Packet-loss Concealment For Use In Broadband IP Networks . Speech-coding according to the forward-backward recovery sub-codec (FB-RSC) method is described . The method provides improved speech quality under packet-loss conditions . In our system , each packet has a maximum of three codes , a main code , a forward sub-code , and a backward sub-code . The main code represents the current frame (current frame, decoder determines concealment, decoder concealment) . The forward and backward sub-codes represent the next and previous frames (frame concealment) . The necessity of the sub-codecs is determined by a sub-codec selector , which considers the SNR of the original and synthesized signals (LP filter excitation signal) in this determination . A relatively low-compression and high-quality form of coding is used in the main codec , while coding with greater compression is used in the sub-codecs . We examined the quality of the proposed method for random loss of individual packets and pairs of consecutive packets . The result shows that our method significantly improves the concealment of packet loss (current frame, decoder determines concealment, decoder concealment) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
2002 IEEE SPEECH CODING WORKSHOP PROCEEDINGS. : 47-49 2002

Publication Year: 2002

Speech Coding Using Motion Picture Compression Techniques

Technische Universität Graz (TU Graz), Austria

Feldbauer, Kubin, Ieee, Ieee
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (audio data) per sample for other frames .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data (average energy) to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .
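Claim 4 above relies on classifying each frame as unvoiced, unvoiced transition, voiced transition, voiced, or onset. The toy classifier below only illustrates how such a five-way decision might be organized; the features (normalized correlation, zero-crossing rate, energy rise) and every threshold are assumptions and do not reproduce the patent's classification rules.

def classify_frame(norm_corr, zero_cross_rate, energy_rise_db, prev_class):
    # Toy five-way decision into the classes named in the claim element above.
    if energy_rise_db > 6.0 and norm_corr > 0.6:
        return "onset"                      # energy build-up at the start of voicing
    if norm_corr > 0.7 and zero_cross_rate < 0.3:
        return "voiced"
    if prev_class in ("voiced", "onset", "voiced transition") and norm_corr > 0.4:
        return "voiced transition"          # weakly voiced frame after a voiced one
    if norm_corr > 0.4:
        return "unvoiced transition"        # possible voicing emerging from unvoiced
    return "unvoiced"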

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data (average energy) to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
Speech Coding Using Motion Picture Compression Techniques . Motion picture compression standards are well developed and widely used in multimedia applications . It could be very convenient to have only one coding algorithm implemented on a multimedia device which deals with video as well as audio and speech data . In this paper , we present a new approach to speech coding where we apply motion picture compression techniques to speech signals (speech signal, decoder determines concealment) . An audio-to-video converter is proposed which prepares the speech or audio data (average energy) to allow direct processing by a video coding algorithm (e.g. MPEG) and a video-to-audio converter which reconstructs the speech signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI. : 1375-1378 2000

Publication Year: 2000

A 1200 Bps Speech Coder Based On MELP

Signalcom Inc

Wang, Koishida, Cuperman, Gersho, Collura, Ieee, Ieee, Ieee
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (improving performance) per sample for other frames .
A 1200 Bps Speech Coder Based On MELP . This paper presents a 1.2 kbps speech coder based on the MELP analysis algorithm . In the proposed coder , the MELP parameters of three consecutive frames are grouped into a superframe and jointly quantized to obtain a high coding efficiency . The inter frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced (U/V) frame combinations in the superframe . Novel techniques for improving performance (average energy) make use of the superframe structure . These include pitch vector quantization using pitch differentials , joint quantization of pitch and U/V decisions and LSF quantization with a forward-backward interpolation method . Subjective test results indicate that the 1.2 kbps speech coder achieves approximately the same quality as the proposed federal standard 2.4 kbps MELP coder .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (improving performance) per sample for other frames .
A 1200 Bps Speech Coder Based On MELP . This paper presents a 1.2 kbps speech coder based on the MELP analysis algorithm . In the proposed coder , the MELP parameters of three consecutive frames are grouped into a superframe and jointly quantized to obtain a high coding efficiency . The inter frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced (U/V) frame combinations in the superframe . Novel techniques for improving performance (average energy) make use of the superframe structure . These include pitch vector quantization using pitch differentials , joint quantization of pitch and U/V decisions and LSF quantization with a forward-backward interpolation method . Subjective test results indicate that the 1.2 kbps speech coder achieves approximately the same quality as the proposed federal standard 2.4 kbps MELP coder .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (improving performance) per sample for other frames .
A 1200 Bps Speech Coder Based On MELP . This paper presents a 1.2 kbps speech coder based on the MELP analysis algorithm . In the proposed coder , the MELP parameters of three consecutive frames are grouped into a superframe and jointly quantized to obtain a high coding efficiency . The inter frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced (U/V) frame combinations in the superframe . Novel techniques for improving performance (average energy) make use of the superframe structure . These include pitch vector quantization using pitch differentials , joint quantization of pitch and U/V decisions and LSF quantization with a forward-backward interpolation method . Subjective test results indicate that the 1.2 kbps speech coder achieves approximately the same quality as the proposed federal standard 2.4 kbps MELP coder .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
2000 IEEE WORKSHOP ON SPEECH CODING, PROCEEDINGS. : 145-147 2000

Publication Year: 2000

An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization

Rheinisch-Westfälische Technische Hochschule Aachen (RWTH Aachen)

Erdmann, Vary, Fischer, Stegmann, Quinquis, Massaloux, Kovesi, Ieee, Ieee
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (Linear Prediction, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) (decoder determines concealment, speech signal) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (Linear Prediction, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) (decoder determines concealment, speech signal) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (Linear Prediction, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) (decoder determines concealment, speech signal) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (enhanced performance, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance (LP filter) for background noise (LP filter) while maintaining its clean speech quality .
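Claim 8 above conditions the energy adjustment on the LP filter of the first good frame having a higher gain than the LP filter of the last erased frame. The sketch below illustrates one way to read that condition, treating the LP filter gain as the energy of its truncated impulse response; the helper names and the square-root scaling rule are assumptions for illustration, not the patent's procedure or the cited paper's method.

```python
# Minimal sketch of the condition in claim 8, shown under illustrative
# assumptions (this is not the patented procedure): treat the "gain" of an
# LP synthesis filter 1/A(z) as the energy of its truncated impulse
# response, and rescale the decoder's excitation in the first good frame
# only when that gain exceeds the gain of the LP filter of the last erased
# frame. The square-root scaling rule below is an assumption.
import numpy as np
from scipy.signal import lfilter


def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of 1/A(z), with A(z) = 1 + a1*z^-1 + ..."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h * h))


def maybe_rescale_excitation(excitation, a_last_erased, a_first_good):
    """Adjust the excitation energy only if the new LP filter has higher gain."""
    g_erased = lp_impulse_response_energy(a_last_erased)
    g_good = lp_impulse_response_energy(a_first_good)
    if g_good > g_erased:
        # Attenuate so the synthesized energy tracks the higher LP gain
        # instead of jumping (illustrative rule, not the patent's equation).
        return excitation * np.sqrt(g_erased / g_good)
    return excitation


if __name__ == "__main__":
    exc = np.ones(16)
    print(maybe_rescale_excitation(exc, [1.0, -0.8], [1.0, -0.95])[:4])
```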

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (enhanced performance, background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of the current frame , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance (LP filter) for background noise (LP filter) while maintaining its clean speech quality .
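The relation quoted in claim 9 above, $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$, can be evaluated numerically once the impulse-response energies of the two LP filters are known. The sketch below works through the arithmetic with made-up example filter coefficients and an example $E_1$; the values are assumptions for illustration only.

```python
# Worked numeric sketch of the energy relation quoted in claim 9,
#   E_q = E_1 * (E_LP0 / E_LP1),
# using the impulse-response energies of two illustrative LP filters.
# The filter coefficients and the E_1 value are made-up example data.
import numpy as np
from scipy.signal import lfilter


def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of 1/A(z), with A(z) = 1 + a1*z^-1 + ..."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h * h))


if __name__ == "__main__":
    a_before = [1.0, -0.8]   # LP filter of the last good frame before the erasure
    a_after = [1.0, -0.95]   # LP filter of the first good frame after the erasure
    e1 = 0.5                 # energy at the end of the current frame (example value)

    e_lp0 = lp_impulse_response_energy(a_before)
    e_lp1 = lp_impulse_response_energy(a_after)
    e_q = e1 * e_lp0 / e_lp1  # target energy for the excitation signal
    print(e_lp0, e_lp1, e_q)
```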

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (enhanced performance, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of the current frame , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance (LP filter) for background noise (LP filter) while maintaining its clean speech quality .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (Linear Prediction, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction (decoder determines concealment, speech signal) ) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .
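Claim 16 above recites computing the energy information parameter from the maximum of the signal energy for voiced or onset frames, and from the average energy per sample otherwise. The sketch below illustrates that branching under stated assumptions; the per-sample maximum-of-squares computation is a simplification chosen for illustration and is not asserted to be the patent's exact measure.

```python
# Minimal sketch (illustrative assumptions, not the patented computer of the
# energy information parameter): use the maximum per-sample signal energy for
# frames classified as voiced or onset, and the average energy per sample for
# all other frame classes, mirroring the split recited in claim 16.
import numpy as np


def energy_information(frame, frame_class):
    """frame: 1-D array of samples; frame_class: classification string."""
    squared = frame.astype(float) ** 2
    if frame_class in ('voiced', 'onset'):
        return float(np.max(squared))   # maximum of the signal energy
    return float(np.mean(squared))      # average energy per sample


if __name__ == "__main__":
    frame = np.array([0.1, -0.4, 0.9, -0.2])
    print(energy_information(frame, 'voiced'))    # 0.81
    print(energy_information(frame, 'unvoiced'))  # mean of the squared samples
```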

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (Linear Prediction, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction (decoder determines concealment, speech signal) ) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .
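Claim 18 above recites limiting the scaling gain to a given value when the first good frame after an erasure is classified as onset. A minimal sketch of that clamp follows; the ceiling value and names are assumptions, not values taken from the patent.

```python
# Minimal sketch of the gain limiting recited by claim 18: when the first
# good frame after an erasure is classified as onset, clamp the gain used to
# scale the synthesized signal to a fixed ceiling. The ceiling value is an
# illustrative assumption.
ONSET_GAIN_CEILING = 1.0  # example limit


def scaling_gain(raw_gain, frame_class):
    if frame_class == 'onset':
        return min(raw_gain, ONSET_GAIN_CEILING)
    return raw_gain


if __name__ == "__main__":
    print(scaling_gain(1.7, 'onset'))   # 1.0
    print(scaling_gain(1.7, 'voiced'))  # 1.7
```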

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (Linear Prediction, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction (decoder determines concealment, speech signal) ) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (enhanced performance, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance (LP filter) for background noise (LP filter) while maintaining its clean speech quality .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (enhanced performance, background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of a current frame , $E_{LP0}$ is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance (LP filter) for background noise (LP filter) while maintaining its clean speech quality .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (Linear Prediction, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction (decoder determines concealment, speech signal) ) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance for background noise while maintaining its clean speech quality .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (enhanced performance, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of a current frame , $E_{LP0}$ is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Multi Rate Wideband Speech Codec With Adaptive Gain Re-quantization . This paper describes an adaptive multi rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services . The coder is a multi rate SB-CELP (Subband-Code-Excited Linear Prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s . Our basic approach consists of an unequal bandsplitting of the input signal into two subbands (SB) . A variable rate multi-mode ACELP coder is applied to the lower subband (0-6 kHz) . The various bit rates are integrated in a common structure where the scalability is realized by exchanging the fixed excitation codebooks while leaving all other codec parameters invariant . For the GSM related modes (9.1-17.8 kbit/s) , the upper subband (6-7 kHz) is coded using a very low bit rate representation based on bandwidth expansion techniques . In case of the 3G application (24 kbit/s) the upper band is coded using a 4 kbit/s ADPCM coding scheme . In addition the analysis by synthesis (AbS) coder of the lower band employs a novel closed loop gain re-quantization technique controlled by the character of the speech signal . Thereby the codec achieves an enhanced performance (LP filter) for background noise (LP filter) while maintaining its clean speech quality .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
2000 IEEE WORKSHOP ON SPEECH CODING, PROCEEDINGS. : 126-128 2000

Publication Year: 2000

Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks

Southern Methodist University (SMU), Dallas, TX, USA

Wang, Gibson
US7693710B2
CLAIM 1
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
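Claim 1 above describes constructing the periodic excitation artificially as a low-pass filtered train of pulses: the first impulse response of a low-pass filter is centred on the quantized first-glottal-pulse position, and further copies are placed at average-pitch spacing to the end of the affected region. The sketch below illustrates that construction under stated assumptions; the toy low-pass impulse response and dimensions are hypothetical, not values from the patent.

```python
# Minimal sketch (illustrative assumptions throughout) of the artificial
# periodic excitation recited by claim 1: centre a low-pass filter impulse
# response on the quantized position of the first glottal pulse, then place
# further copies every (rounded) average pitch period up to the end of the
# region being reconstructed.
import numpy as np


def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    """frame_len: samples to reconstruct; first_pulse_pos: quantized position
    of the first glottal pulse; avg_pitch: average pitch period in samples;
    lp_ir: impulse response of a low-pass filter (odd length, peak centred)."""
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        lo, hi = max(0, pos - half), min(frame_len, pos + half + 1)
        exc[lo:hi] += lp_ir[(lo - (pos - half)):(hi - (pos - half))]
        pos += int(round(avg_pitch))
    return exc


if __name__ == "__main__":
    lp_ir = np.array([0.25, 0.5, 1.0, 0.5, 0.25])  # toy low-pass impulse response
    print(build_periodic_excitation(40, first_pulse_pos=5, avg_pitch=12.4, lp_ir=lp_ir))
```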

US7693710B2
CLAIM 2
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
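Claim 2 above recites encoding the shape, sign and amplitude of the first glottal pulse for transmission. The sketch below shows one hypothetical way such an encoder could be organised; the two-entry shape "codebook", the 4-bit amplitude quantizer and the segment length are invented for illustration and do not come from the patent or the cited reference.

```python
# Minimal sketch (illustrative, not the patent's coding scheme) of encoding
# the sign, amplitude, and shape of the first glottal pulse. The shape
# "codebook" and the 4-bit amplitude quantizer are made-up examples; the
# input is assumed to hold at least three samples around the pulse.
import numpy as np

SHAPE_CODEBOOK = np.array([
    [0.2, 1.0, 0.2],   # narrow pulse shape
    [0.5, 1.0, 0.5],   # wider pulse shape
])


def encode_first_glottal_pulse(pulse_segment):
    """pulse_segment: samples around the detected first glottal pulse."""
    peak = pulse_segment[np.argmax(np.abs(pulse_segment))]
    sign = 0 if peak >= 0 else 1
    amp_index = int(np.clip(round(abs(peak) * 15), 0, 15))  # 4-bit amplitude index
    seg = np.abs(pulse_segment[:3])
    seg = seg / (np.max(seg) + 1e-12)                       # normalize the shape
    shape_index = int(np.argmin(np.sum((SHAPE_CODEBOOK - seg) ** 2, axis=1)))
    return sign, amp_index, shape_index


if __name__ == "__main__":
    print(encode_first_glottal_pulse(np.array([0.3, -0.9, 0.2])))
```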

US7693710B2
CLAIM 3
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
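Claim 3 above recites finding the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizing its position. A minimal sketch follows; the uniform position quantizer and the search over the LP residual are assumptions for illustration only.

```python
# Minimal sketch of the glottal-pulse position search recited by claim 3:
# take the sample of maximum amplitude inside the first pitch period of the
# (residual) frame as the first glottal pulse, then quantize its position.
# The uniform position quantizer below is an illustrative assumption.
import numpy as np


def first_glottal_pulse_position(residual, pitch_period, num_levels=64):
    search = residual[:pitch_period]
    pos = int(np.argmax(np.abs(search)))        # sample of maximum amplitude
    step = max(1, pitch_period // num_levels)   # uniform quantization step
    q_index = pos // step
    q_pos = q_index * step                      # quantized position
    return pos, q_index, q_pos


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    residual = rng.normal(size=160)
    residual[37] = 5.0                          # dominant pulse for the demo
    print(first_glottal_pulse_position(residual, pitch_period=80))
```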

US7693710B2
CLAIM 4
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
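Claim 5 above recites controlling the energy of the synthesized signal: matching its energy at the start of the first good frame to the energy at the end of the erasure, then converging toward the energy indicated by the received energy information parameter while limiting any increase. The sketch below illustrates that two-step control under stated assumptions; the 16-sample energy windows, the linear gain ramp and the increase cap are hypothetical choices, not the patented procedure.

```python
# Minimal sketch of the energy control recited by claim 5 (illustrative,
# not the patented procedure): scale the start of the first good frame to
# match the synthesis energy at the end of the erasure, then ramp the gain
# toward the value implied by the received energy parameter, capping how
# much the energy may grow across the frame.
import numpy as np


def control_energy(frame, e_end_of_erasure, e_target, max_gain_increase=2.0):
    """frame: synthesized samples of the first good frame after the erasure."""
    eps = 1e-12
    e_start = float(np.mean(frame[:16] ** 2)) + eps    # energy at the frame start
    g0 = np.sqrt(e_end_of_erasure / e_start)           # match the erased-frame energy
    g1 = np.sqrt(e_target / (float(np.mean(frame[-16:] ** 2)) + eps))
    g1 = min(g1, g0 * max_gain_increase)               # limit the energy increase
    gains = np.linspace(g0, g1, len(frame))            # converge toward the target
    return frame * gains


if __name__ == "__main__":
    frame = np.sin(np.linspace(0, 20 * np.pi, 160))
    out = control_energy(frame, e_end_of_erasure=0.1, e_target=0.5)
    print(float(np.mean(out[:16] ** 2)), float(np.mean(out[-16:] ** 2)))
```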

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (frame erasure) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (when frame) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (when frame) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of the current frame , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (frame erasure) , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (when frame) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of the current frame , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies (LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSFs can cause error propagation when frame erasures (frame erasure) occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0.5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
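A minimal sketch of the claim 16 energy-information computer: the maximum of the signal energy for frames classified as voiced or onset, and the average energy per sample for other frames. The per-sample squaring used as the energy measure and the dB conversion are illustrative assumptions.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Claim 16 energy parameter: maximum of the signal energy for
    frames classified as voiced or onset, average energy per sample
    for other frames. The dB scaling is an illustrative choice."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        value = float(np.max(x ** 2))   # maximum of the signal energy
    else:
        value = float(np.mean(x ** 2))  # average energy per sample
    return 10.0 * np.log10(value + 1e-12)

frame = np.sin(2 * np.pi * 100.0 * np.arange(256) / 8000.0)
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))
```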

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
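The claim 17 energy control can be read as a per-sample gain that starts from a value matching the energy at the end of the last concealed frame and converges, toward the end of the first good frame, to a gain derived from the received energy information parameter, with any increase capped. The linear gain interpolation, the 16-sample energy windows and the cap value in the sketch below are assumptions, not the patent's procedure.

```python
import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_received, max_gain=2.0):
    """Scale the synthesized first non-erased frame so its energy at the
    frame start matches the energy at the end of the last concealed
    frame, then converge toward the received energy parameter by the
    frame end while limiting any increase (claim 17). Linear gain
    interpolation, 16-sample windows and the 2.0 cap are assumptions."""
    synth = np.asarray(synth, dtype=float)
    e_start = float(np.mean(synth[:16] ** 2)) + 1e-12   # local energy at frame start
    e_stop = float(np.mean(synth[-16:] ** 2)) + 1e-12   # local energy at frame end
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)  # match concealed energy
    g1 = min(np.sqrt(e_received / e_stop), max_gain)        # converge to received energy
    gains = np.linspace(g0, g1, len(synth))
    return synth * gains

frame = 0.1 * np.random.default_rng(0).standard_normal(256)
print(scale_first_good_frame(frame, e_end_concealed=0.02, e_received=0.01).shape)
```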

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (frame erasure) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
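Claim 19 makes the scaling gain at the start of the first good frame equal to the gain at its end in two transition cases. A compact way to express that rule, with the class labels taken from the claim and the comfort-noise/active-speech flags as assumed inputs:

```python
def use_end_gain_at_start(last_class, first_class,
                          last_was_comfort_noise, first_is_active_speech):
    """True when, per claim 19, the gain at the beginning of the first
    non-erased frame equals the gain used at its end: (a) a voiced-to-
    unvoiced transition, or (b) comfort noise followed by active speech.
    The two boolean flags are assumed inputs for illustration."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    case_a = last_class in voiced_like and first_class == "unvoiced"
    case_b = last_was_comfort_noise and first_is_active_speech
    return case_a or case_b

print(use_end_gain_at_start("voiced", "unvoiced", False, False))  # True
print(use_end_gain_at_start("unvoiced", "voiced", False, False))  # False
```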

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (when frame) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (when frame) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .
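Claims 20 and 21 adjust the excitation energy using the relation E_q = E_1 · E_LP0 / E_LP1, where E_LP0 and E_LP1 are energies of LP-filter impulse responses. The sketch below computes those energies from LP coefficients by direct recursion; the 64-sample truncation and the toy first-order filters are assumptions for illustration.

```python
import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z),
    with A(z) = 1 + a[1] z^-1 + ... , truncated to `length` samples
    (the truncation length is an assumption)."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjusted_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * E_LP0 / E_LP1 per claims 20-21: E_1 is the energy at
    the end of the current frame, E_LP0 and E_LP1 the impulse-response
    energies of the LP filters of the last good frame before and the
    first good frame after the erasure."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1

# Toy first-order LP filters for illustration
print(adjusted_energy(1.0, np.array([1.0, -0.9]), np.array([1.0, -0.5])))
```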

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (when frame) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
Performance Comparison Of Intraframe And Interframe LSF Quantization In Packet Networks . Line Spectrum Frequencies(LSF) have been the prevailing parameter set to represent LPC coefficients in speech coding . Extensive research has been performed to exploit their interframe and intraframe correlations and quantize them more efficiently . Interframe coding of LSF's can cause error propagation when frame erasure (frame erasure) s occur . Since most LSF quantizers were designed with the primary concerns of bit-rate and complexity , less attention was paid to error propagation . We investigate the erasure performance of interframe LSF coding and compare it with an intraframe coding method . Our results show that with only 5% extra bit-rate , intraframe coding is much more robust to frame erasures and a typical improvement of 0 . 5 dB on spectral distortion can be obtained with 20% packet loss . Subjective listening tests indicate significant improvement as well .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
ICASSP 99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI. : 5-8 1999

Publication Year: 1999

A 16, 24, 32 Kbit/s Wideband Speech Codec Based On ATCELP

Rheinisch-Westfälische Technische Hochschule Aachen (RWTH Aachen University)

Combescure, Schnitzler, Fischer, Kirchherr, Lamblin, Le Guyader, Massaloux, Quinquis, Stegmann, Vary
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .
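Claim 1 constructs the periodic excitation artificially as a low-pass filtered train of pulses: the first impulse response of the low-pass filter is centered on the quantized first-glottal-pulse position and the remaining responses are spaced by the average pitch value. In the sketch below the windowed-sinc low-pass filter, its cutoff and length are illustrative assumptions (the claim does not prescribe a particular filter), and placing responses up to the frame end rather than the last affected subframe is a simplification.

```python
import numpy as np

def lowpass_impulse_response(cutoff=0.125, taps=31):
    """Windowed-sinc low-pass impulse response. The filter choice,
    cutoff and length are illustrative assumptions."""
    n = np.arange(taps) - (taps - 1) / 2
    return 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(taps)

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, h):
    """Artificial periodic excitation per claim 1: center the first
    low-pass impulse response on the quantized first-glottal-pulse
    position, then place the remaining responses every avg_pitch
    samples (here up to the frame end, a simplification)."""
    exc = np.zeros(frame_len)
    half = len(h) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        for i, v in enumerate(h):
            j = pos - half + i
            if 0 <= j < frame_len:
                exc[j] += v
        pos += avg_pitch
    return exc

exc = build_periodic_excitation(256, first_pulse_pos=20, avg_pitch=60,
                                h=lowpass_impulse_response())
print(np.flatnonzero(np.abs(exc) == np.abs(exc).max()))  # pulse centers
```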

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .
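Claim 4 classifies successive frames as unvoiced, unvoiced transition, voiced transition, voiced, or onset. The toy decision rule below uses only a voicing measure and an energy rise; the features and all thresholds are assumptions and do not represent the classification logic of the patent or of the cited reference.

```python
def classify_frame(voicing, energy_rise_db, prev_class):
    """Toy five-way classifier (unvoiced, unvoiced transition, voiced
    transition, voiced, onset) per claim 4. The features (a voicing
    measure in [0, 1] and an energy rise in dB) and all thresholds are
    assumptions, not the decision logic of the patent."""
    voiced_like = ("voiced", "voiced transition", "onset")
    if voicing < 0.3:
        return "unvoiced transition" if prev_class in voiced_like else "unvoiced"
    if voicing >= 0.6 and energy_rise_db > 6.0 and prev_class not in voiced_like:
        return "onset"
    if voicing >= 0.6:
        return "voiced"
    return "voiced transition"

print(classify_frame(0.8, 9.0, "unvoiced"))  # onset
print(classify_frame(0.2, 0.0, "voiced"))    # unvoiced transition
```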

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (frame erasure concealment) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .
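Claim 6 (like claim 18) limits the gain used for scaling the synthesized signal to a given value when the first good frame after an erasure is classified as onset. A one-line guard, with the cap value being an illustrative assumption:

```python
def limit_onset_gain(gain, first_frame_class, cap=1.5):
    """Limit the scaling gain to a given value when the first non-erased
    frame after an erasure is classified as onset (claims 6 and 18).
    The cap of 1.5 is an illustrative assumption."""
    return min(gain, cap) if first_frame_class == "onset" else gain

print(limit_onset_gain(3.0, "onset"), limit_onset_gain(3.0, "voiced"))  # 1.5 3.0
```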

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (frame erasure concealment) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (frame erasure concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (frame erasure concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (frame erasure concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (frame erasure concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (frame erasure concealment) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (frame erasure concealment) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (frame erasure concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (CELP codec) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (frame erasure concealment) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A 16 , 24 , 32 Kbit/s Wideband Speech Codec Based On ATCELP . This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited Linear Prediction (CELP) algorithm , called ATCELP , for the compression of wideband (7 kHz) signals . The CELP algorithm applies mainly to speech , whereas the ATC mode is selected for music and noise signals . We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment (frame erasure concealment) technique . Subjective listening tests have shown that the ATCELP codec (frame concealment) at bit rates of 16 , 24 and 32 kbit/s achieved performances close to those of the CCITT G . 722 at 48 , 56 and 64 kbit/s , respectively , at most operating conditions .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
ICASSP 99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI. : 197-200 1999

Publication Year: 1999

An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter

COMSAT Lab., Clarksburg, MD, USA

Mustapha, Yeldener
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (high p) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (average pitch, E q) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (flat frequency response, pass filter) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (pass filter, LP filter, LP filter excitation signal) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response (pass filter, LP filter, LP filter excitation signal) at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (flat frequency response, pass filter) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q (high p) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (pass filter, LP filter, LP filter excitation signal) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response (pass filter, LP filter, LP filter excitation signal) at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .
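For readability, the energy adjustment of claims 8-9 (and the parallel device claims) is rendered above as the reconstructed relation E q = E 1 √(E LP0 / E LP1). The short Python sketch below only illustrates how such a scaling could be applied to the excitation of the first good frame; the variable names and the final rescaling step are assumptions, not language from the patent.

    import numpy as np

    def adjust_excitation_energy(exc, E1, h_lp0, h_lp1):
        """Scale the first-good-frame excitation toward the target energy E_q."""
        E_LP0 = np.sum(h_lp0 ** 2)            # energy of the old LP impulse response
        E_LP1 = np.sum(h_lp1 ** 2)            # energy of the new LP impulse response
        E_q = E1 * np.sqrt(E_LP0 / E_LP1)     # target excitation energy
        E_exc = np.sum(exc ** 2)
        if E_exc > 0.0:
            exc = exc * np.sqrt(E_q / E_exc)  # rescale the excitation
        return exc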

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (flat frequency response, pass filter) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (high p) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (pass filter, LP filter, LP filter excitation signal) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response (pass filter, LP filter, LP filter excitation signal) at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (high p) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (average pitch, E q) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (flat frequency response, pass filter) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (pass filter, LP filter, LP filter excitation signal) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response (pass filter, LP filter, LP filter excitation signal) at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (flat frequency response, pass filter) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q (high p) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (pass filter, LP filter, LP filter excitation signal) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response (pass filter, LP filter, LP filter excitation signal) at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (flat frequency response, pass filter) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (high p) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
An Adaptive Post-filtering Technique Based On The Modified Yule-Walker Filter . This paper presents an adaptive time-domain post-filtering technique based on the modified Yule-Walker filter . Conventionally , post-filtering is derived from an original LPC spectrum [1] . In general , this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis , inverse and high pass filtering (pass filter, LP filter, LP filter excitation signal) and causes unnecessary attenuation or amplification of some frequency components that introduces muffling in speech quality . This effect increases when voice coders are tandemed together . Another approach of designing a post-filter was developed by McAulay and Quatieri [2] which can only be used in sinusoidal based speech coders . We have also developed another new time-domain post-filtering technique . This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders . The new post-filter has a flat frequency response (pass filter, LP filter, LP filter excitation signal) at the formant peaks of speech spectrum . Instead of looking at the modified LPC synthesis , inverse , and high pass filtering in the conventional time-domain technique , we gather information about the poles of the LPC spectrum in the new technique . This post-filtering technique has been used in a 4 kb/s Harmonic Excitation Linear Predictive Coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
1997 IEEE WORKSHOP ON SPEECH CODING FOR TELECOMMUNICATIONS, PROCEEDINGS. : 75-76 1997

Publication Year: 1997

A Robust Low Rate Voice Codec For Wireless Communications

Hughes Network Systems

Swaminathan, Nandkumar, Bhaskar, Kowalski, Patel, Zakaria, Li, Prasad
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .
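As a reading aid for the phase information parameter of claims 2 and 14, the quantities the encoder is said to transmit can be grouped as in the minimal sketch below; the container and field names are hypothetical and only mirror the claim language (position, shape, sign and amplitude of the first glottal pulse).

    from dataclasses import dataclass

    @dataclass
    class GlottalPulseInfo:
        position: int     # quantized position of the first glottal pulse in the frame
        shape: int        # index of the encoded pulse shape
        sign: int         # encoded sign of the pulse (+1 or -1)
        amplitude: float  # encoded (quantized) pulse amplitude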

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .
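Claim 3 locates the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizes its position. A minimal sketch of that step, with an assumed LP-residual input and an assumed uniform quantization step, could look as follows.

    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period, step=4):
        """Return the quantized index of the maximum-amplitude sample
        inside the first pitch period of the residual."""
        segment = residual[:pitch_period]
        pos = int(np.argmax(np.abs(segment)))  # sample of maximum amplitude
        return (pos // step) * step            # coarse uniform quantization (assumed)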

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (Wireless Communication) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
A Robust Low Rate Voice Codec For Wireless Communications (signal energy) . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks (average energy) , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .
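Claim 4 (and claims 16 and 24) computes the energy information parameter differently depending on the frame class: in relation to the maximum of the signal energy for voiced or onset frames, and in relation to the average energy per sample otherwise. A compact sketch under those assumptions, with illustrative names only:

    import numpy as np

    def energy_information(frame, frame_class):
        """Energy information parameter following the claimed classification split."""
        if frame_class in ("voiced", "onset"):
            return float(np.max(frame ** 2))   # maximum of the signal energy
        return float(np.mean(frame ** 2))      # average energy per sample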

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .
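Claim 5 (mirrored by claim 17) controls the synthesized-signal energy in the first good frame after an erasure: the frame start is scaled to match the energy at the end of the concealed frame, and the gain then converges toward the transmitted energy target while any increase is limited (claim 6 additionally caps this gain when the frame is an onset). The sketch below shows one way such a ramp could be realized; g_begin, g_end and the cap value are illustrative assumptions, not taken from the patent.

    import numpy as np

    def smooth_frame_energy(synth, g_begin, g_end, g_max=2.0):
        """Apply a per-sample gain that starts at g_begin and converges to g_end,
        with both gains limited to g_max to bound the energy increase."""
        g0 = min(g_begin, g_max)
        g1 = min(g_end, g_max)
        gains = np.linspace(g0, g1, num=len(synth))  # linear convergence across the frame
        return synth * gains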

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (transmission error) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise (LP filter) conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors and high quality across changing speech levels and background noise (LP filter) conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise (LP filter) conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (Wireless Communication) for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
A Robust Low Rate Voice Codec For Wireless Communications (signal energy) . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks (average energy) , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (transmission error) , limits to a given value a gain used for scaling the synthesized sound signal .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise (LP filter) conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors and high quality across changing speech levels and background noise (LP filter) conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (Wireless Communication) for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
A Robust Low Rate Voice Codec For Wireless Communications (signal energy) . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks (average energy) , and multistage LSF techniques are used , resulting in robustness to transmission errors and high quality across changing speech levels and background noise conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .
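As an illustration of the class-dependent energy information parameter recited in claim 24, a short Python sketch follows. It is not drawn from either document: the per-sample maximum used for voiced/onset frames and the dB representation are stand-ins for whatever pitch-synchronous maximum and quantizer a real codec would use.

import numpy as np

def energy_information_parameter(frame, frame_class):
    # Voiced/onset frames: maximum of the signal energy within the frame
    # (largest squared sample, used here as a simple stand-in).
    # Other classes: average energy per sample.
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energy = float(np.max(frame ** 2))
    else:
        energy = float(np.mean(frame ** 2))
    return 10.0 * np.log10(energy + 1e-12)  # value a codec would typically quantize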

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (transmission error) and decoder recovery (transmission error) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
A Robust Low Rate Voice Codec For Wireless Communications . The design , implementation and performance of a high quality low bit rate speech codec for wireless communication is presented . The codec is based on the CELP model . Generalized analysis-by-synthesis , algebraic fixed codebooks , and multistage LSF techniques are used , resulting in robustness to transmission errors (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) and high quality across changing speech levels and background noise (LP filter) conditions . The bit allocations for the quantization of LSF , pitch and the excitation are chosen in a mode specific manner based on a robust mode classification scheme . A 4.8 kb/s version has been implemented and subjective tests show speech quality that is equivalent to or better than most cellular standard codecs . Performance is also consistent across speech levels and transmission errors .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6260009B1

Filed: 1999-02-12     Issued: 2001-07-10

CELP-based to CELP-based vocoder packet translation

(Original Assignee) Qualcomm Inc     (Current Assignee) Qualcomm Inc

Andrew P. DeJaco
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (said model) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6260009B1
CLAIM 4
. The apparatus of claim 1 , wherein said model order converter (first impulse) further comprises : a formant filter coefficient translator that translates said input formant filter coefficients to a third CELP format prior to use by said speech synthesizer to produce third coefficients .
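To illustrate the artificial periodic excitation recited in claim 1 above, a low-pass filtered train of pulses anchored at the quantized first glottal pulse position and repeated at the average pitch, here is a minimal Python sketch. The low-pass prototype, its centering, and the rounding of the pitch value are assumptions for illustration, not material from either patent.

import numpy as np

def build_onset_excitation(frame_len, first_pulse_pos, avg_pitch, lp_proto):
    # lp_proto: odd-length impulse response of an assumed low-pass filter.
    # The first copy is centered on the quantized first-glottal-pulse position;
    # further copies are placed every avg_pitch samples until the end of the
    # region affected by the artificial construction.
    lp_proto = np.asarray(lp_proto, dtype=float)
    exc = np.zeros(frame_len)
    half = len(lp_proto) // 2
    pos = int(first_pulse_pos)
    step = max(1, int(round(avg_pitch)))
    while pos < frame_len:
        lo, hi = pos - half, pos + half + 1
        src_lo = max(0, -lo)
        src_hi = len(lp_proto) - max(0, hi - frame_len)
        exc[max(lo, 0):min(hi, frame_len)] += lp_proto[src_lo:src_hi]
        pos += step
    return exc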

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (pitch p) = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260009B1
CLAIM 1
. An apparatus for converting a compressed speech packet from one code excited linear prediction (CELP) format to another , comprising : a formant parameter translator that translates input formant filter coefficients having an input CELP format and corresponding to a speech packet to an output CELP format to produce output formant filter coefficients ;
and an excitation parameter translator that translates input pitch and codebook parameters having an input CELP format and corresponding to said speech packet to said output CELP format to produce output pitch and codebook parameters , wherein said excitation parameter translator comprises : a model order converter that converts the model order of said input formant filter coefficients from a model order of said input CELP format to a model order of said output CELP format ;
a time base converter that converts the time base of said input formant filter coefficients from a time base of said input CELP format to a time base of said output CELP format ;
a speech synthesizer that produces a target signal using said input pitch and codebook parameters and said output formant filter coefficients ;
and a searcher that searches for said output codebook and pitch parameters (E q) using said target signal and said output formant filter coefficients .
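For orientation on the converter stages recited in US6260009B1 claim 1, the following Python sketch shows one plausible and deliberately simplified way to convert model order and time base by interpolating line spectral frequencies. It is not the reference's method, and the analysis-by-synthesis re-search of the output pitch and codebook parameters is omitted.

import numpy as np

def convert_model_order(lsf, new_order):
    # Resample an LSF vector to a different LP model order by linear
    # interpolation (an illustrative placeholder for the model order converter).
    old_axis = np.linspace(0.0, 1.0, len(lsf))
    new_axis = np.linspace(0.0, 1.0, new_order)
    return np.interp(new_axis, old_axis, np.asarray(lsf, dtype=float))

def convert_time_base(lsf_per_subframe, new_subframe_count):
    # Interpolate per-subframe LSF sets onto a different subframe grid
    # (an illustrative placeholder for the time base converter).
    lsf_per_subframe = np.asarray(lsf_per_subframe, dtype=float)
    old_axis = np.linspace(0.0, 1.0, lsf_per_subframe.shape[0])
    new_axis = np.linspace(0.0, 1.0, new_subframe_count)
    return np.stack([np.interp(new_axis, old_axis, lsf_per_subframe[:, k])
                     for k in range(lsf_per_subframe.shape[1])], axis=1)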

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260009B1
CLAIM 1
. An apparatus for converting a compressed speech packet from one code excited linear prediction (CELP) format to another , comprising : a formant parameter translator that translates input formant filter coefficients having an input CELP format and corresponding to a speech packet to an output CELP format to produce output formant filter coefficients ;
and an excitation parameter translator that translates input pitch and codebook parameters having an input CELP format and corresponding to said speech packet to said output CELP format to produce output pitch and codebook parameters , wherein said excitation parameter translator comprises : a model order converter that converts the model order of said input formant filter coefficients from a model order of said input CELP format to a model order of said output CELP format ;
a time base converter that converts the time base of said input formant filter coefficients from a time base of said input CELP format to a time base of said output CELP format ;
a speech synthesizer that produces a target signal using said input pitch and codebook parameters and said output formant filter coefficients ;
and a searcher that searches for said output codebook and pitch parameters (E q) using said target signal and said output formant filter coefficients .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (said model) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6260009B1
CLAIM 4
. The apparatus of claim 1 , wherein said model order converter (first impulse) further comprises : a formant filter coefficient translator that translates said input formant filter coefficients to a third CELP format prior to use by said speech synthesizer to produce third coefficients .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (pitch p) = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260009B1
CLAIM 1
. An apparatus for converting a compressed speech packet from one code excited linear prediction (CELP) format to another , comprising : a formant parameter translator that translates input formant filter coefficients having an input CELP format and corresponding to a speech packet to an output CELP format to produce output formant filter coefficients ;
and an excitation parameter translator that translates input pitch and codebook parameters having an input CELP format and corresponding to said speech packet to said output CELP format to produce output pitch and codebook parameters , wherein said excitation parameter translator comprises : a model order converter that converts the model order of said input formant filter coefficients from a model order of said input CELP format to a model order of said output CELP format ;
a time base converter that converts the time base of said input formant filter coefficients from a time base of said input CELP format to a time base of said output CELP format ;
a speech synthesizer that produces a target signal using said input pitch and codebook parameters and said output formant filter coefficients ;
and a searcher that searches for said output codebook and pitch parameters (E q) using said target signal and said output formant filter coefficients .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260009B1
CLAIM 1
. An apparatus for converting a compressed speech packet from one code excited linear prediction (CELP) format to another , comprising : a formant parameter translator that translates input formant filter coefficients having an input CELP format and corresponding to a speech packet to an output CELP format to produce output formant filter coefficients ;
and an excitation parameter translator that translates input pitch and codebook parameters having an input CELP format and corresponding to said speech packet to said output CELP format to produce output pitch and codebook parameters , wherein said excitation parameter translator comprises : a model order converter that converts the model order of said input formant filter coefficients from a model order of said input CELP format to a model order of said output CELP format ;
a time base converter that converts the time base of said input formant filter coefficients from a time base of said input CELP format to a time base of said output CELP format ;
a speech synthesizer that produces a target signal using said input pitch and codebook parameters and said output formant filter coefficients ;
and a searcher that searches for said output codebook and pitch parameters (E q) using said target signal and said output formant filter coefficients .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6233550B1

Filed: 1998-08-28     Issued: 2001-05-15

Method and apparatus for hybrid coding of speech at 4kbps

(Original Assignee) University of California     (Current Assignee) University of California

Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (successive frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6233550B1
CLAIM 6
. A method for encoding speech in an encoder for communication to a decoder for reproduction thereof , said speech comprising a plurality of frames of speech , said method comprising the steps of : (a) classifying each frame of speech into three or more classes wherein one or more of said classes is transitory in character ;
(b) representing the speech in a frame of speech associated with at least one of said classes with a harmonic model ;
(c) computing parameter values of said harmonic model where said parameter values are characteristic of the frame ;
(d) quantizing said parameters for communication to said decoder ;
(e) wherein one or more of said transitory classes is encoded using a coding technique selected from the group consisting of waveform-matching coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (f) phase aligning the reproduced speech across the boundary between two successive frames (onset frame) of speech where one frame of speech is waveform coded and the other frame of speech is harmonic coded .
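As a rough illustration of the class-dependent coding in US6233550B1 claim 6, where transitory frames are waveform or analysis-by-synthesis coded and the other classes are harmonic coded, a small Python sketch follows. The classification features and thresholds are arbitrary assumptions and far cruder than the reference's classifier.

import numpy as np

def classify_frame(frame, corr_threshold=0.5, energy_floor=1e-4):
    # Crude 3-way classification (voiced / unvoiced / transitory), illustration only.
    frame = np.asarray(frame, dtype=float)
    energy = float(np.mean(frame ** 2))
    if energy < energy_floor:
        return "unvoiced"
    # one-lag normalized autocorrelation as a rough periodicity cue
    corr = float(np.dot(frame[1:], frame[:-1]) / (np.dot(frame, frame) + 1e-12))
    return "voiced" if corr > corr_threshold else "transitory"

def select_coder(frame):
    # Transitory frames go to a waveform / analysis-by-synthesis coder,
    # the remaining classes to a harmonic coder, per the claim's structure.
    cls = classify_frame(frame)
    return cls, ("waveform" if cls == "transitory" else "harmonic")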

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6233550B1
CLAIM 12
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonic coding frames associated with at least one of said classes ;
(c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (d) means for phase aligning a harmonic coded frame in a decoder when the preceding frame has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame adjacent to a harmonic coded frame .
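To make the energy control recited in claim 5 concrete, here is a minimal Python sketch; the quarter-frame energy estimates, the linear gain interpolation, and the clamp used to limit any increase in energy are assumptions, not the '710 patent's actual procedure.

import numpy as np

def scale_first_good_frame(synth, e_prev_end, e_target, max_gain=1.98):
    # g0 matches the frame-start energy to the energy at the end of the last
    # concealed frame; g1 moves the frame-end energy toward the received energy
    # information; the gain is interpolated sample by sample, with both factors
    # clipped so any energy increase stays bounded.
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    q = max(1, n // 4)
    e_begin = float(np.mean(synth[:q] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-q:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_prev_end / e_begin), max_gain)
    g1 = min(np.sqrt(e_target / e_end), max_gain)
    gains = g0 + (g1 - g0) * np.arange(n) / n
    return synth * gains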

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6233550B1
CLAIM 12
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonic coding frames associated with at least one of said classes ;
(c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (d) means for phase aligning a harmonic coded frame in a decoder when the preceding frame has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame adjacent to a harmonic coded frame .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6233550B1
CLAIM 12
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonic coding frames associated with at least one of said classes ;
(c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (d) means for phase aligning a harmonic coded frame in a decoder when the preceding frame has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame adjacent to a harmonic coded frame .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (successive frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6233550B1
CLAIM 6
. A method for encoding speech in an encoder for communication to a decoder for reproduction thereof , said speech comprising a plurality of frames of speech , said method comprising the steps of : (a) classifying each frame of speech into three or more classes wherein one or more of said classes is transitory in character ;
(b) representing the speech in a frame of speech associated with at least one of said classes with a harmonic model ;
(c) computing parameter values of said harmonic model where said parameter values are characteristic of the frame ;
(d) quantizing said parameters for communication to said decoder ;
(e) wherein one or more of said transitory classes is encoded using a coding technique selected from the group consisting of waveform-matching coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (f) phase aligning the reproduced speech across the boundary between two successive frames (onset frame) of speech where one frame of speech is waveform coded and the other frame of speech is harmonic coded .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6233550B1
CLAIM 12
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonic coding frames associated with at least one of said classes ;
(c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (d) means for phase aligning a harmonic coded frame in a decoder when the preceding frame has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame adjacent to a harmonic coded frame .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6233550B1
CLAIM 12
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonic coding frames associated with at least one of said classes ;
(c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (d) means for phase aligning a harmonic coded frame in a decoder when the preceding frame has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame adjacent to a harmonic coded frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6233550B1
CLAIM 12
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonic coding frames associated with at least one of said classes ;
(c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding ;
and (d) means for phase aligning a harmonic coded frame in a decoder when the preceding frame has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame adjacent to a harmonic coded frame .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5864798A

Filed: 1996-09-17     Issued: 1999-01-26

Method and apparatus for adjusting a spectrum shape of a speech signal

(Original Assignee) Toshiba Corp     (Current Assignee) Toshiba Corp

Kimio Miseki, Masahiro Oshikiri, Akinobu Yamashita, Masami Akamine, Tadashi Amada
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (autocorrelation coefficients, second filters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5864798A
CLAIM 11
. An apparatus for adjusting a spectrum shape of an input speech signal , comprising : a first filter having a pole-zero transfer function which subjects said input speech signal to a spectrum envelop emphasis ;
and a second filter which compensates a spectral tilt of the spectrum shape of the input speech signal caused by said first filter , the second filter including : a calculator which independently derives two filter coefficients from the pole-zero transfer function of said first filter ;
and a filter section which subjects a speech signal output from said first filter to a filtering process using the derived filter coefficients and which compensates said spectral tilt caused by the first filter , wherein said calculator calculates a first parameter corresponding to multiple-order partial autocorrelation coefficients (LP filter, LP filter excitation signal, pass filter) which are approximated to a spectrum envelop of a zero transfer function of said first filter and a second parameter corresponding to multiple-order partial autocorrelation coefficients which are approximated to a spectrum envelop of a pole transfer function of said first filter , said calculator inputs the first parameter and the second parameter to said filter section , and said filter section includes a transfer function which uses the first parameter and the second parameter to compensate the spectral tilt caused by said first filter .

US5864798A
CLAIM 15
. A method for adjusting a spectrum shape of an input speech signal , comprising the steps of : preparing a first filter having a pole-zero transfer function represented by A(z)/B(z) and a second filter for compensating characteristics of the first filter , the second filter having a first-order transfer function represented by (1 - μ_z z^-1)/(1 - μ_p z^-1) , where μ_z and μ_p are respective filter coefficients whose absolute values are smaller than 1 ;
and filtering the speech signal by means of the first and second filters (LP filter, LP filter excitation signal, pass filter) .
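The first-order transfer function recited in US5864798A claim 15 corresponds to the difference equation y[n] = x[n] - μ_z·x[n-1] + μ_p·y[n-1]; a minimal Python implementation is sketched below as a direct-form recursion with zero initial state assumed.

import numpy as np

def tilt_compensation(x, mu_z, mu_p):
    # H(z) = (1 - mu_z * z^-1) / (1 - mu_p * z^-1), with |mu_z|, |mu_p| < 1.
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    x_prev = y_prev = 0.0
    for n, xn in enumerate(x):
        y[n] = xn - mu_z * x_prev + mu_p * y_prev
        x_prev, y_prev = xn, y[n]
    return y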

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (autocorrelation coefficients, second filters) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5864798A
CLAIM 11
. An apparatus for adjusting a spectrum shape of an input speech signal , comprising : a first filter having a pole-zero transfer function which subjects said input speech signal to a spectrum envelop emphasis ;
and a second filter which compensates a spectral tilt of the spectrum shape of the input speech signal caused by said first filter , the second filter including : a calculator which independently derives two filter coefficients from the pole-zero transfer function of said first filter ;
and a filter section which subjects a speech signal output from said first filter to a filtering process using the derived filter coefficients and which compensates said spectral tilt caused by the first filter , wherein said calculator calculates a first parameter corresponding to multiple-order partial autocorrelation coefficients (LP filter, LP filter excitation signal, pass filter) which are approximated to a spectrum envelop of a zero transfer function of said first filter and a second parameter corresponding to multiple-order partial autocorrelation coefficients which are approximated to a spectrum envelop of a pole transfer function of said first filter , said calculator inputs the first parameter and the second parameter to said filter section , and said filter section includes a transfer function which uses the first parameter and the second parameter to compensate the spectral tilt caused by said first filter .

US5864798A
CLAIM 15
. A method for adjusting a spectrum shape of an input speech signal , comprising the steps of : preparing a first filter having a pole-zero transfer function represented by A(z)/B(z) and a second filter for compensating characteristics of the first filter , the second filter having a first-order transfer function represented by (1 - μ_z z^-1)/(1 - μ_p z^-1) , where μ_z and μ_p are respective filter coefficients whose absolute values are smaller than 1 ;
and filtering the speech signal by means of the first and second filters (LP filter, LP filter excitation signal, pass filter) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (autocorrelation coefficients, second filters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5864798A
CLAIM 11
. An apparatus for adjusting a spectrum shape of an input speech signal , comprising : a first filter having a pole-zero transfer function which subjects said input speech signal to a spectrum envelop emphasis ;
and a second filter which compensates a spectral tilt of the spectrum shape of the input speech signal caused by said first filter , the second filter including : a calculator which independently derives two filter coefficients from the pole-zero transfer function of said first filter ;
and a filter section which subjects a speech signal output from said first filter to a filtering process using the derived filter coefficients and which compensates said spectral tilt caused by the first filter , wherein said calculator calculates a first parameter corresponding to multiple-order partial autocorrelation coefficients (LP filter, LP filter excitation signal, pass filter) which are approximated to a spectrum envelop of a zero transfer function of said first filter and a second parameter corresponding to multiple-order partial autocorrelation coefficients which are approximated to a spectrum envelop of a pole transfer function of said first filter , said calculator inputs the first parameter and the second parameter to said filter section , and said filter section includes a transfer function which uses the first parameter and the second parameter to compensate the spectral tilt caused by said first filter .

US5864798A
CLAIM 15
. A method for adjusting a spectrum shape of an input speech signal , comprising the steps of : preparing a first filter having a pole-zero transfer function represented by A(z)/B(z) and a second filter for compensating characteristics of the first filter , the second filter having a first-order transfer function represented by (1 - μ_z z^-1)/(1 - μ_p z^-1) , where μ_z and μ_p are respective filter coefficients whose absolute values are smaller than 1 ;
and filtering the speech signal by means of the first and second filters (LP filter, LP filter excitation signal, pass filter) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (autocorrelation coefficients, second filters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5864798A
CLAIM 11
. An apparatus for adjusting a spectrum shape of an input speech signal , comprising : a first filter having a pole-zero transfer function which subjects said input speech signal to a spectrum envelop emphasis ;
and a second filter which compensates a spectral tilt of the spectrum shape of the input speech signal caused by said first filter , the second filter including : a calculator which independently derives two filter coefficients from the pole-zero transfer function of said first filter ;
and a filter section which subjects a speech signal output from said first filter to a filtering process using the derived filter coefficients and which compensates said spectral tilt caused by the first filter , wherein said calculator calculates a first parameter corresponding to multiple-order partial autocorrelation coefficients (LP filter, LP filter excitation signal, pass filter) which are approximated to a spectrum envelop of a zero transfer function of said first filter and a second parameter corresponding to multiple-order partial autocorrelation coefficients which are approximated to a spectrum envelop of a pole transfer function of said first filter , said calculator inputs the first parameter and the second parameter to said filter section , and said filter section includes a transfer function which uses the first parameter and the second parameter to compensate the spectral tilt caused by said first filter .

US5864798A
CLAIM 15
. A method for adjusting a spectrum shape of an input speech signal , comprising the steps of : preparing a first filter having a pole-zero transfer function represented by A(z)/B(z) and a second filter for compensating characteristics of the first filter , the second filter having a first-order transfer function represented by (1 - μ_z z^-1)/(1 - μ_p z^-1) , where μ_z and μ_p are respective filter coefficients whose absolute values are smaller than 1 ;
and filtering the speech signal by means of the first and second filters (LP filter, LP filter excitation signal, pass filter) .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (autocorrelation coefficients, second filters) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5864798A
CLAIM 11
. An apparatus for adjusting a spectrum shape of an input speech signal , comprising : a first filter having a pole-zero transfer function which subjects said input speech signal to a spectrum envelop emphasis ;
and a second filter which compensates a spectral tilt of the spectrum shape of the input speech signal caused by said first filter , the second filter including : a calculator which independently derives two filter coefficients from the pole-zero transfer function of said first filter ;
and a filter section which subjects a speech signal output from said first filter to a filtering process using the derived filter coefficients and which compensates said spectral tilt caused by the first filter , wherein said calculator calculates a first parameter corresponding to multiple-order partial autocorrelation coefficients (LP filter, LP filter excitation signal, pass filter) which are approximated to a spectrum envelop of a zero transfer function of said first filter and a second parameter corresponding to multiple-order partial autocorrelation coefficients which are approximated to a spectrum envelop of a pole transfer function of said first filter , said calculator inputs the first parameter and the second parameter to said filter section , and said filter section includes a transfer function which uses the first parameter and the second parameter to compensate the spectral tilt caused by said first filter .

US5864798A
CLAIM 15
. A method for adjusting a spectrum shape of an input speech signal , comprising the steps of : preparing a first filter having a pole-zero transfer function represented by A(z)/B(z) and a second filter for compensating characteristics of the first filter , the second filter having a first-order transfer function represented by (1 - μ_z z^-1)/(1 - μ_p z^-1) , where μ_z and μ_p are respective filter coefficients whose absolute values are smaller than 1 ;
and filtering the speech signal by means of the first and second filters (LP filter, LP filter excitation signal, pass filter) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (autocorrelation coefficients, second filters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5864798A
CLAIM 11
. An apparatus for adjusting a spectrum shape of an input speech signal , comprising : a first filter having a pole-zero transfer function which subjects said input speech signal to a spectrum envelop emphasis ;
and a second filter which compensates a spectral tilt of the spectrum shape of the input speech signal caused by said first filter , the second filter including : a calculator which independently derives two filter coefficients from the pole-zero transfer function of said first filter ;
and a filter section which subjects a speech signal output from said first filter to a filtering process using the derived filter coefficients and which compensates said spectral tilt caused by the first filter , wherein said calculator calculates a first parameter corresponding to multiple-order partial autocorrelation coefficients (LP filter, LP filter excitation signal, pass filter) which are approximated to a spectrum envelop of a zero transfer function of said first filter and a second parameter corresponding to multiple-order partial autocorrelation coefficients which are approximated to a spectrum envelop of a pole transfer function of said first filter , said calculator inputs the first parameter and the second parameter to said filter section , and said filter section includes a transfer function which uses the first parameter and the second parameter to compensate the spectral tilt caused by said first filter .

US5864798A
CLAIM 15
. A method for adjusting a spectrum shape of an input speech signal , comprising the steps of : preparing a first filter having a pole-zero transfer function represented by A(z)/B(z) and a second filter for compensating characteristics of the first filter , the second filter having a first-order transfer function represented by (1-μ_z z^-1)/(1-μ_p z^-1) , where μ_z and μ_p are respective filter coefficients whose absolute values are smaller than 1 ;
and filtering the speech signal by means of the first and second filters (LP filter, LP filter excitation signal, pass filter) .
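For context on the US5864798A mappings above, the sketch below applies the first-order tilt-compensation transfer function (1-μ_z z^-1)/(1-μ_p z^-1) of claim 15 as a direct-form recursion in Python; the coefficient values and the impulse test input are illustrative assumptions, not values taken from the reference.

# Minimal sketch of the tilt-compensation filter H(z) = (1 - mu_z*z^-1)/(1 - mu_p*z^-1)
# from US5864798A claim 15; mu_z and mu_p below are assumed values with |mu| < 1.
def tilt_compensate(signal, mu_z=0.3, mu_p=0.6):
    # Difference equation: y[n] = x[n] - mu_z*x[n-1] + mu_p*y[n-1]
    y, x_prev, y_prev = [], 0.0, 0.0
    for x in signal:
        out = x - mu_z * x_prev + mu_p * y_prev
        y.append(out)
        x_prev, y_prev = x, out
    return y

if __name__ == "__main__":
    # Impulse response of the compensator (assumed test input)
    print(tilt_compensate([1.0, 0.0, 0.0, 0.0]))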




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5651092A

Filed: 1996-06-27     Issued: 1997-07-22

Method and apparatus for speech encoding, speech decoding, and speech post processing

(Original Assignee) Mitsubishi Electric Corp     (Current Assignee) Mitsubishi Electric Corp

Jun Ishii, Shinya Takahashi
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (predetermined characteristic) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics (first impulse) , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .
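The artificial periodic excitation recited in US7693710B2 claim 1 above (a low-pass filtered train of pulses, the first impulse response centered on the quantized first glottal pulse position and the remaining responses spaced by the average pitch value) can be pictured with the Python sketch below; the impulse response, frame length, pulse position and pitch value are illustrative assumptions.

# Minimal sketch of the artificial periodic-excitation construction of claim 1.
def build_periodic_excitation(lp_impulse, first_pulse_pos, avg_pitch, frame_len):
    excitation = [0.0] * frame_len
    half = len(lp_impulse) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        # Center one copy of the low-pass filter impulse response on pos
        for k, h in enumerate(lp_impulse):
            idx = pos - half + k
            if 0 <= idx < frame_len:
                excitation[idx] += h
        pos += avg_pitch  # next pulse placed one average pitch period later
    return excitation

if __name__ == "__main__":
    lp_h = [0.25, 0.5, 1.0, 0.5, 0.25]  # assumed short low-pass impulse response
    print(build_periodic_excitation(lp_h, first_pulse_pos=10, avg_pitch=40, frame_len=160))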

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .
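A minimal Python sketch of the phase-information step of claim 3, under the assumptions that the search operates on one pitch period of an LP residual and that a uniform quantization step is used (neither detail is recited in the claim):

# Sketch: the first glottal pulse is the sample of maximum amplitude within a pitch
# period, and its position is quantized; 'step' is an assumed quantizer resolution.
def quantize_first_glottal_pulse(residual, pitch_period, step=2):
    window = residual[:pitch_period]
    pos = max(range(len(window)), key=lambda n: abs(window[n]))
    return (pos // step) * step

if __name__ == "__main__":
    res = [0.1, -0.2, 0.05, 0.9, -0.3, 0.2, 0.1, 0.0]
    print(quantize_first_glottal_pulse(res, pitch_period=8))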

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .
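The energy-information computation of claim 4 distinguishes voiced/onset frames from other frames; the Python sketch below reads "maximum of a signal energy" as the largest squared sample in the frame, which is only an illustrative interpretation of the claim language, not a construction.

# Sketch of the energy information parameter of claim 4.
def energy_information(frame, frame_class):
    if frame_class in ("voiced", "onset"):
        return max(x * x for x in frame)           # maximum of the signal energy (assumed reading)
    return sum(x * x for x in frame) / len(frame)  # average energy per sample for other frames

if __name__ == "__main__":
    frame = [0.1, 0.4, -0.8, 0.3]
    print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))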

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .
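The energy control of claim 5 can be pictured as a per-sample gain interpolation across the received first non erased frame, from a gain chosen for continuity with the concealed frame to a gain implied by the received energy information parameter, with any increase capped; the gain values and the cap in the sketch below are assumptions.

# Sketch of the scaling/convergence step of claim 5.
def scale_first_good_frame(frame, g_begin, g_end, max_gain=2.0):
    g_begin = min(g_begin, max_gain)  # limit an increase in energy (assumed cap value)
    g_end = min(g_end, max_gain)
    n = len(frame)
    return [((g_end - g_begin) * i / n + g_begin) * x for i, x in enumerate(frame)]

if __name__ == "__main__":
    synth = [0.2, 0.1, -0.3, 0.4, 0.0, -0.1]
    print(scale_first_good_frame(synth, g_begin=0.8, g_end=1.2))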

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (predetermined characteristic) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics (first impulse) , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5651092A
CLAIM 1
. A speech decoding apparatus (decoder recovery, decoder constructs) , comprising : (a) harmonics decoding means for receiving encoded amplitude and phase values of a plurality of harmonic components of an input speech signal , for decoding the plurality of harmonic components from the encoded amplitude and phase values and for providing at an output a plurality of decoded harmonic components ;
(b) amplitude suppression means , coupled to the harmonic decoding means , for receiving the plurality of decoded harmonic components , for suppressing masked harmonic components of the plurality of decoded harmonic components and for outputting an amplitude and phase value of harmonic components of the plurality of decoded harmonic components which have not been suppressed ;
the amplitude suppression means including : means for selecting one of the plurality of decoded harmonic components , threshold means for establishing a masking threshold level of the one of the plurality of harmonic components , wherein the masking threshold level is established based on the amplitude of the one of the plurality of decoded harmonic components and a set of predetermined characteristics , and attenuating means for attenuating harmonic components of the plurality of decoded harmonic components having an amplitude less than the masking threshold level ;
and (c) speech synthesis means for synthesizing speech from the amplitude and phase values of harmonic components which have not been suppressed .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP0747883A2

Filed: 1996-05-29     Issued: 1996-12-11

Voiced/unvoiced classification of speech for use in speech decoding during frame erasures

(Original Assignee) AT&T Corp; AT&T IPM Corp     (Current Assignee) AT&T Corp

Peter Kroon, Yair Shoham
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (said second portion) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion (first impulse) if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .
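EP0747883A2 claim 1, as quoted above, selects the concealment excitation from either the adaptive-codebook or the fixed-codebook contribution according to the periodic/non-periodic classification; a minimal Python sketch, with the contribution signals assumed:

# Sketch of the excitation selection of EP0747883A2 claim 1 during an unreliable frame.
def concealment_excitation(adaptive_contrib, fixed_contrib, is_periodic):
    return list(adaptive_contrib) if is_periodic else list(fixed_contrib)

if __name__ == "__main__":
    adaptive = [0.5, 0.4, 0.3]
    fixed = [0.1, -0.2, 0.05]
    print(concealment_excitation(adaptive, fixed, is_periodic=True))
    print(concealment_excitation(adaptive, fixed, is_periodic=False))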

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook (average energy) , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP0747883A2
CLAIM 5
The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .

EP0747883A2
CLAIM 5
The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .
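The relation E_q = E_1 · (E_LP0 / E_LP1) recited in claims 9, 12 and 25 can be sketched in Python as below; the impulse-response length, the direct-form synthesis-filter recursion and the example LP coefficients are assumptions, and the ratio form follows the chart text as reconstructed above.

# Sketch of the excitation-energy adjustment E_q = E_1 * E_LP0 / E_LP1.
def lp_impulse_energy(lp_coeffs, n_samples=64):
    # Impulse response of the synthesis filter 1/A(z), A(z) = 1 + a1*z^-1 + a2*z^-2 + ...
    y = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        y.append(x - sum(a * y[-(k + 1)] for k, a in enumerate(lp_coeffs) if k < len(y)))
    return sum(v * v for v in y)

def adjusted_energy(e1, lp_last_good, lp_first_good):
    return e1 * lp_impulse_energy(lp_last_good) / lp_impulse_energy(lp_first_good)

if __name__ == "__main__":
    # Assumed first-order LP coefficients for the last good and first good frames
    print(adjusted_energy(0.5, lp_last_good=[-0.9], lp_first_good=[-0.5]))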

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .

EP0747883A2
CLAIM 5
The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (said second portion) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion (first impulse) if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .
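
The artificial periodic excitation of claim 13 (a low-pass filtered train of pulses, the first impulse response centred on the quantized first-glottal-pulse position and the remaining ones spaced by the average pitch value) can be illustrated with the following Python sketch; the frame length, the odd-length low-pass kernel lp_h and the clipping at frame boundaries are assumptions, not details taken from the claim.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_h):
    # Artificial periodic part for a lost onset (claim 13): place the low-pass
    # filter impulse response lp_h (odd length assumed) centred on the quantized
    # first-glottal-pulse position, then repeat it every avg_pitch samples up to
    # the end of the region being reconstructed.
    exc = np.zeros(frame_len)
    lp_h = np.asarray(lp_h, dtype=float)
    half = len(lp_h) // 2
    step = max(1, int(round(avg_pitch)))      # guard against a zero pitch value
    pos = int(first_pulse_pos)
    while pos < frame_len:
        lo, hi = max(0, pos - half), min(frame_len, pos + half + 1)
        exc[lo:hi] += lp_h[(lo - pos) + half:(hi - pos) + half]
        pos += step
    return exc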

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook (average energy) , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .
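
The "computer of the energy information parameter" of claims 16 and 24 can be read as the following Python sketch: a maximum of the signal energy for frames classified as voiced or onset, and an average energy per sample otherwise. Taking the maximum over the last pitch period is an assumption made here for illustration only.

import numpy as np

def energy_information_parameter(frame, frame_class, pitch_period=None):
    # Claims 16 and 24: maximum of the signal energy for voiced or onset frames,
    # average energy per sample for other frames.
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        seg = frame[-int(pitch_period):] if pitch_period else frame
        return float(np.max(seg ** 2))        # maximum signal energy (assumed window)
    return float(np.mean(frame ** 2))         # average energy per sample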

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP0747883A2
CLAIM 5
The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .

EP0747883A2
CLAIM 5
The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook (average energy) , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747883A2
CLAIM 1
A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as periodic or non-periodic ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as periodic , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as non-periodic .

EP0747883A2
CLAIM 5
The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5701392A

Filed: 1995-07-31     Issued: 1997-12-23

Depth-first algebraic-codebook search for fast coding of speech

(Original Assignee) Universite de Sherbrooke     (Current Assignee) Universite de Sherbrooke

Jean-Pierre Adoul, Claude Laflamme
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .
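
Claims 3 and 11 determine the phase information by taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position. A minimal Python sketch follows, assuming the search runs on the LP residual and using an illustrative uniform quantizer with a 6-bit budget; neither assumption comes from the claims themselves.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, bits=6):
    # Claims 3 and 11: the first glottal pulse is the sample of maximum amplitude
    # within a pitch period; its position is then quantized (uniform quantizer
    # and bit budget assumed here for illustration).
    segment = np.asarray(residual[:int(pitch_period)], dtype=float)
    pos = int(np.argmax(np.abs(segment)))                 # sample of maximum amplitude
    step = max(1, int(np.ceil(int(pitch_period) / 2 ** bits)))
    return pos, (pos // step) * step                      # raw and quantized positions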

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .
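
The energy control of claims 5 and 17 (scale the synthesized signal so that its energy at the beginning of the first good frame matches the energy at the end of the last erased frame, then converge toward the received energy information toward the end of the frame while limiting any increase) can be sketched as follows; the analysis window length, the gain cap and the linear gain interpolation are assumptions made for illustration.

import numpy as np

def control_synthesis_energy(synth, E_end_concealed, E_received, max_gain=2.0):
    # Claims 5 and 17: g0 matches the start of the first good frame to the energy
    # at the end of the last concealed frame; g1 steers the end of the frame
    # toward the received energy information; the gain is interpolated across the
    # frame and capped to limit the energy increase.
    synth = np.asarray(synth, dtype=float)
    win = min(64, len(synth))
    E_begin = np.mean(synth[:win] ** 2) + 1e-12
    E_end = np.mean(synth[-win:] ** 2) + 1e-12
    g0 = min(np.sqrt(E_end_concealed / E_begin), max_gain)
    g1 = min(np.sqrt(E_received / E_end), max_gain)
    return np.linspace(g0, g1, len(synth)) * synth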

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number, last non) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5701392A
CLAIM 1
. A method of encoding a sound signal , comprising the steps of : providing a codebook circuit for forming a codebook including a set of codevectors A k each defining a plurality of different positions p and comprising N non-zero-amplitude pulses each assignable to predetermined valid positions p of the codevector ;
providing a device for conducting in said codebook a depth-first search involving a tree structure defining a number M of ordered levels , each level m being associated with a predetermined number (last non) N m of non-zero-amplitude pulses , N m ≧1 , wherein the sum of said predetermined numbers associated with all said M levels is equal to the number N of the non-zero-amplitude pulses comprised in said codevectors , each level m of the tree structure being further associated with a path building operation , with a given pulse-order rule and with a given selection criterion ;
wherein : in a level 1 of the tree structure , the associated path-building operation comprises the following substeps : choosing a number N 1 of said N non-zero-amplitude pulses in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N 1 non-zero-amplitude pulses in relation to the associated selection criterion to define at least one level-1 candidate path ;
in a level m of the tree structure , the associated path-building operation defines recursively a level-m candidate path by extending a level-(m-1) candidate path through the following substeps : choosing N m of said non-zero-amplitude pulses not previously chosen in the course of building said level-(m-1) path in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N m non-zero-amplitude pulses in relation to the associated selection criterion to form at least one level-m candidate path ;
and wherein a level-M candidate path originated at a level-1 and extended during the path-building operations associated with subsequent levels of the tree structure determines the respective positions p of the N non-zero-amplitude pulses of a codevector and thereby defines a candidate codevector A k .

US5701392A
CLAIM 11
. A sound signal encoding method as recited in claim 2 , wherein said N non-zero-amplitude pulses have respective indexes , and wherein , in each said subsequent search level of the tree structure , the step of choosing at least one of said non-zero-amplitude pulses not previously chosen in relation to the associated pulse-ordering function comprises laying out the indexes of the pulses not previously chosen on a circle and choosing said at least one non-zero-amplitude pulse in accordance with a clockwise sequence of the indexes starting at the right of the last non-zero-amplitude pulse (last non) selected in the former search level of the tree structure .
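
Claim 7 (and device claim 19) makes the scaling gain at the beginning of the first good frame equal to the gain used at its end in two situations: a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition. A direct, illustrative transcription of that condition into Python, with class labels as assumed strings:

def keep_end_gain_at_beginning(last_good_class, first_good_class,
                               last_good_is_comfort_noise, first_good_is_active):
    # Claim 7 / claim 19: no separate gain ramp at the start of the first good
    # frame for a voiced-to-unvoiced transition or for a comfort-noise-to-
    # active-speech transition.
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    dtx_to_active = last_good_is_comfort_noise and first_good_is_active
    return voiced_to_unvoiced or dtx_to_active

When this condition holds, g0 would simply be set equal to g1 in an energy-control routine such as the claim 5 sketch above.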

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (order r) = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number, last non) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701392A
CLAIM 1
. A method of encoding a sound signal , comprising the steps of : providing a codebook circuit for forming a codebook including a set of codevectors A k each defining a plurality of different positions p and comprising N non-zero-amplitude pulses each assignable to predetermined valid positions p of the codevector ;
providing a device for conducting in said codebook a depth-first search involving a tree structure defining a number M of ordered levels , each level m being associated with a predetermined number (last non) N m of non-zero-amplitude pulses , N m ≧1 , wherein the sum of said predetermined numbers associated with all said M levels is equal to the number N of the non-zero-amplitude pulses comprised in said codevectors , each level m of the tree structure being further associated with a path building operation , with a given pulse-order rule (E q) and with a given selection criterion ;
wherein : in a level 1 of the tree structure , the associated path-building operation comprises the following substeps : choosing a number N 1 of said N non-zero-amplitude pulses in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N 1 non-zero-amplitude pulses in relation to the associated selection criterion to define at least one level-1 candidate path ;
in a level m of the tree structure , the associated path-building operation defines recursively a level-m candidate path by extending a level-(m-1) candidate path through the following substeps : choosing N m of said non-zero-amplitude pulses not previously chosen in the course of building said level-(m-1) path in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N m non-zero-amplitude pulses in relation to the associated selection criterion to form at least one level-m candidate path ;
and wherein a level-M candidate path originated at a level-1 and extended during the path-building operations associated with subsequent levels of the tree structure determines the respective positions p of the N non-zero-amplitude pulses of a codevector and thereby defines a candidate codevector A k .

US5701392A
CLAIM 11
. A sound signal encoding method as recited in claim 2 , wherein said N non-zero-amplitude pulses have respective indexes , and wherein , in each said subsequent search level of the tree structure , the step of choosing at least one of said non-zero-amplitude pulses not previously chosen in relation to the associated pulse-ordering function comprises laying out the indexes of the pulses not previously chosen on a circle and choosing said at least one non-zero-amplitude pulse in accordance with a clockwise sequence of the indexes starting at the right of the last non-zero-amplitude pulse (last non) selected in the former search level of the tree structure .
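
The depth-first algebraic-codebook search of US5701392A claim 1 (a tree of ordered levels, each level extending candidate paths with valid pulse positions, the completed level-M path defining a candidate codevector) is sketched below in a heavily reduced Python form: one unit-amplitude pulse per level, an exhaustive rather than pruned tree walk, and an ACELP-style ratio Q = C^2 / E standing in for claim 5's "given mathematical ratio". These simplifications are assumptions; the reference itself also covers pulse signs, pulse-order rules and multi-pulse levels.

import numpy as np

def depth_first_codebook_search(d, phi, tracks):
    # Reduced sketch of the depth-first search: extend the current path with one
    # valid position per level and score completed paths with Q = C^2 / E.
    best = {"Q": -np.inf, "path": None}

    def extend(level, path, C, E):
        if level == len(tracks):                      # level-M candidate path
            Q = C * C / E if E > 0 else float("-inf")
            if Q > best["Q"]:
                best["Q"], best["path"] = Q, list(path)
            return
        for p in tracks[level]:                       # valid positions of this pulse
            extend(level + 1, path + [p],
                   C + d[p],
                   E + phi[p, p] + 2.0 * sum(phi[p, q] for q in path))

    extend(0, [], 0.0, 0.0)
    return best["path"], best["Q"]

Here d would be the backward-filtered target, phi the correlation matrix of the filtered impulse response, and tracks the lists of valid positions of each pulse.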

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (order r) = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number, last non) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701392A
CLAIM 1
. A method of encoding a sound signal , comprising the steps of : providing a codebook circuit for forming a codebook including a set of codevectors A k each defining a plurality of different positions p and comprising N non-zero-amplitude pulses each assignable to predetermined valid positions p of the codevector ;
providing a device for conducting in said codebook a depth-first search involving a tree structure defining a number M of ordered levels , each level m being associated with a predetermined number (last non) N m of non-zero-amplitude pulses , N m ≧1 , wherein the sum of said predetermined numbers associated with all said M levels is equal to the number N of the non-zero-amplitude pulses comprised in said codevectors , each level m of the tree structure being further associated with a path building operation , with a given pulse-order rule (E q) and with a given selection criterion ;
wherein : in a level 1 of the tree structure , the associated path-building operation comprises the following substeps : choosing a number N 1 of said N non-zero-amplitude pulses in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N 1 non-zero-amplitude pulses in relation to the associated selection criterion to define at least one level-1 candidate path ;
in a level m of the tree structure , the associated path-building operation defines recursively a level-m candidate path by extending a level-(m-1) candidate path through the following substeps : choosing N m of said non-zero-amplitude pulses not previously chosen in the course of building said level-(m-1) path in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N m non-zero-amplitude pulses in relation to the associated selection criterion to form at least one level-m candidate path ;
and wherein a level-M candidate path originated at a level-1 and extended during the path-building operations associated with subsequent levels of the tree structure determines the respective positions p of the N non-zero-amplitude pulses of a codevector and thereby defines a candidate codevector A k .

US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US5701392A
CLAIM 11
. A sound signal encoding method as recited in claim 2 , wherein said N non-zero-amplitude pulses have respective indexes , and wherein , in each said subsequent search level of the tree structure , the step of choosing at least one of said non-zero-amplitude pulses not previously chosen in relation to the associated pulse-ordering function comprises laying out the indexes of the pulses not previously chosen on a circle and choosing said at least one non-zero-amplitude pulse in accordance with a clockwise sequence of the indexes starting at the right of the last non-zero-amplitude pulse (last non) selected in the former search level of the tree structure .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number, last non) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5701392A
CLAIM 1
. A method of encoding a sound signal , comprising the steps of : providing a codebook circuit for forming a codebook including a set of codevectors A k each defining a plurality of different positions p and comprising N non-zero-amplitude pulses each assignable to predetermined valid positions p of the codevector ;
providing a device for conducting in said codebook a depth-first search involving a tree structure defining a number M of ordered levels , each level m being associated with a predetermined number (last non) N m of non-zero-amplitude pulses , N m ≧1 , wherein the sum of said predetermined numbers associated with all said M levels is equal to the number N of the non-zero-amplitude pulses comprised in said codevectors , each level m of the tree structure being further associated with a path building operation , with a given pulse-order rule and with a given selection criterion ;
wherein : in a level 1 of the tree structure , the associated path-building operation comprises the following substeps : choosing a number N 1 of said N non-zero-amplitude pulses in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N 1 non-zero-amplitude pulses in relation to the associated selection criterion to define at least one level-1 candidate path ;
in a level m of the tree structure , the associated path-building operation defines recursively a level-m candidate path by extending a level-(m-1) candidate path through the following substeps : choosing N m of said non-zero-amplitude pulses not previously chosen in the course of building said level-(m-1) path in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N m non-zero-amplitude pulses in relation to the associated selection criterion to form at least one level-m candidate path ;
and wherein a level-M candidate path originated at a level-1 and extended during the path-building operations associated with subsequent levels of the tree structure determines the respective positions p of the N non-zero-amplitude pulses of a codevector and thereby defines a candidate codevector A k .

US5701392A
CLAIM 11
. A sound signal encoding method as recited in claim 2 , wherein said N non-zero-amplitude pulses have respective indexes , and wherein , in each said subsequent search level of the tree structure , the step of choosing at least one of said non-zero-amplitude pulses not previously chosen in relation to the associated pulse-ordering function comprises laying out the indexes of the pulses not previously chosen on a circle and choosing said at least one non-zero-amplitude pulse in accordance with a clockwise sequence of the indexes starting at the right of the last non-zero-amplitude pulse (last non) selected in the former search level of the tree structure .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (order r) = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number, last non) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701392A
CLAIM 1
. A method of encoding a sound signal , comprising the steps of : providing a codebook circuit for forming a codebook including a set of codevectors A k each defining a plurality of different positions p and comprising N non-zero-amplitude pulses each assignable to predetermined valid positions p of the codevector ;
providing a device for conducting in said codebook a depth-first search involving a tree structure defining a number M of ordered levels , each level m being associated with a predetermined number (last non) N m of non-zero-amplitude pulses , N m ≧1 , wherein the sum of said predetermined numbers associated with all said M levels is equal to the number N of the non-zero-amplitude pulses comprised in said codevectors , each level m of the tree structure being further associated with a path building operation , with a given pulse-order rule (E q) and with a given selection criterion ;
wherein : in a level 1 of the tree structure , the associated path-building operation comprises the following substeps : choosing a number N 1 of said N non-zero-amplitude pulses in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N 1 non-zero-amplitude pulses in relation to the associated selection criterion to define at least one level-1 candidate path ;
in a level m of the tree structure , the associated path-building operation defines recursively a level-m candidate path by extending a level-(m-1) candidate path through the following substeps : choosing N m of said non-zero-amplitude pulses not previously chosen in the course of building said level-(m-1) path in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N m non-zero-amplitude pulses in relation to the associated selection criterion to form at least one level-m candidate path ;
and wherein a level-M candidate path originated at a level-1 and extended during the path-building operations associated with subsequent levels of the tree structure determines the respective positions p of the N non-zero-amplitude pulses of a codevector and thereby defines a candidate codevector A k .

US5701392A
CLAIM 11
. A sound signal encoding method as recited in claim 2 , wherein said N non-zero-amplitude pulses have respective indexes , and wherein , in each said subsequent search level of the tree structure , the step of choosing at least one of said non-zero-amplitude pulses not previously chosen in relation to the associated pulse-ordering function comprises laying out the indexes of the pulses not previously chosen on a circle and choosing said at least one non-zero-amplitude pulse in accordance with a clockwise sequence of the indexes starting at the right of the last non-zero-amplitude pulse (last non) selected in the former search level of the tree structure .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
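
As a rough illustration of the searcher/quantizer recited here (measuring the sample of maximum amplitude within a pitch period and quantizing its position), the sketch below assumes the search is done on the LP residual and uses a plain uniform quantizer; both assumptions go beyond the claim text.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Hypothetical sketch: take the sample of maximum absolute amplitude in the
    first pitch period as the first glottal pulse, then quantize its position
    with a uniform step."""
    segment = residual[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))   # sample of maximum amplitude within the pitch period
    q_pos = step * round(pos / step)        # uniformly quantized position (illustrative quantizer)
    return pos, q_pos
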
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
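
The energy information parameter of claim 24 distinguishes voiced/onset frames from other frames. A minimal sketch follows, assuming a raw speech frame s and the two measures named in the claim (maximum of the signal energy versus average energy per sample); the dB conversion and the class labels are illustrative assumptions.

import numpy as np

def energy_information_parameter(s, frame_class):
    """Sketch: maximum of the signal energy for voiced/onset frames,
    average energy per sample otherwise."""
    if frame_class in ("voiced", "onset"):
        e = np.max(s ** 2)              # maximum of the signal energy
    else:
        e = np.mean(s ** 2)             # average energy per sample
    return 10.0 * np.log10(e + 1e-12)   # illustrative dB-domain parameter
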
US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (order r) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame (predetermined number, last non) received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701392A
CLAIM 1
. A method of encoding a sound signal , comprising the steps of : providing a codebook circuit for forming a codebook including a set of codevectors A k each defining a plurality of different positions p and comprising N non-zero-amplitude pulses each assignable to predetermined valid positions p of the codevector ;
providing a device for conducting in said codebook a depth-first search involving a tree structure defining a number M of ordered levels , each level m being associated with a predetermined number (last non) N m of non-zero-amplitude pulses , N m ≧1 , wherein the sum of said predetermined numbers associated with all said M levels is equal to the number N of the non-zero-amplitude pulses comprised in said codevectors , each level m of the tree structure being further associated with a path building operation , with a given pulse-order rule (E q) and with a given selection criterion ;
wherein : in a level 1 of the tree structure , the associated path-building operation comprises the following substeps : choosing a number N 1 of said N non-zero-amplitude pulses in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N 1 non-zero-amplitude pulses in relation to the associated selection criterion to define at least one level-1 candidate path ;
in a level m of the tree structure , the associated path-building operation defines recursively a level-m candidate path by extending a level-(m-1) candidate path through the following substeps : choosing N m of said non-zero-amplitude pulses not previously chosen in the course of building said level-(m-1) path in relation to the associated pulse-order rule ;
selecting at least one of the valid positions p of said N m non-zero-amplitude pulses in relation to the associated selection criterion to form at least one level-m candidate path ;
and wherein a level-M candidate path originated at a level-1 and extended during the path-building operations associated with subsequent levels of the tree structure determines the respective positions p of the N non-zero-amplitude pulses of a codevector and thereby defines a candidate codevector A k .

US5701392A
CLAIM 5
. A sound signal encoding method as recited in claim 2 , wherein , in each said subsequent search level of the tree structure , the selecting step (signal classification parameter) comprises : calculating a given mathematical ratio for each path defined by the pulse position(s) p selected in the former search level(s) and extended by each valid position p of said at least one pulse of the subset associated to said subsequent search level ;
and retaining the extended path defined by the pulse positions p that maximize said given ratio .

US5701392A
CLAIM 11
. A sound signal encoding method as recited in claim 2 , wherein said N non-zero-amplitude pulses have respective indexes , and wherein , in each said subsequent search level of the tree structure , the step of choosing at least one of said non-zero-amplitude pulses not previously chosen in relation to the associated pulse-ordering function comprises laying out the indexes of the pulses not previously chosen on a circle and choosing said at least one non-zero-amplitude pulse in accordance with a clockwise sequence of the indexes starting at the right of the last non-zero-amplitude pulse (last non) selected in the former search level of the tree structure .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5754976A

Filed: 1995-07-28     Issued: 1998-05-19

Algebraic codebook with signal-selected pulse amplitude/position combinations for fast coding of speech

(Original Assignee) Universite de Sherbrooke     (Current Assignee) Universite de Sherbrooke

Jean-Pierre Adoul, Claude Laflamme
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (following inequality) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
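
The artificial periodic excitation of claim 1 is described as a low-pass filtered train of pulses anchored at the quantized first-glottal-pulse position. Below is a minimal sketch under assumed details (a short FIR low-pass impulse response, a rounded average pitch value, and simple buffer bounds); it is not the codec's actual construction.

import numpy as np

def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, h_lowpass):
    """Sketch: center the first low-pass impulse response on the quantized
    first glottal pulse position, then repeat it every (rounded) average
    pitch period up to the end of the affected region."""
    exc = np.zeros(frame_len)
    half = len(h_lowpass) // 2
    pos = int(q_pulse_pos)
    period = int(round(avg_pitch))
    while pos < frame_len:
        for i, h in enumerate(h_lowpass):       # place one impulse response centered at pos
            n = pos - half + i
            if 0 <= n < frame_len:
                exc[n] += h
        pos += period                           # next pulse one average pitch period later
    return exc
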
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .
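
The pre-selection step of US5754976A claim 1 pre-assigns valid amplitudes to each position so that only amplitude-consistent pulse combinations are searched. The sketch below assumes a binary amplitude alphabet (+1/-1) derived from the sign of a reference signal b, which is one common reading but is not spelled out in the claim.

import numpy as np

def preselect_amplitudes(b):
    """Sketch: function S_p pre-assigning one valid amplitude (+1 or -1)
    to each of the L positions, taken from the sign of a reference signal b."""
    return np.where(b >= 0.0, 1.0, -1.0)

def respects_preselection(pulse_positions, pulse_amps, S):
    """Only pulse amplitude/position combinations whose non-zero pulses match
    the pre-established function S are kept in the search."""
    return all(S[p] == a for p, a in zip(pulse_positions, pulse_amps))

A search loop would then simply skip any candidate combination for which respects_preselection(...) returns False, which is how the claimed reduction in search complexity arises.
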

US5754976A
CLAIM 8
. The method of claim 7 , wherein the step of maximizing said given ratio comprises the step of skipping at least the innermost loop of the N nested loops whenever the following inequality (average pitch value) is true ##EQU32## where S_p_n is the amplitude pre-assigned to position p_n , D_p_n is the p_n-th component of the target signal D , and T_D is a threshold related to the backward-filtered target signal D .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
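
Claim 5 recites two energy-control operations in the first good frame: matching the energy at its beginning to the energy at the end of the last concealed frame, and converging toward the transmitted energy target by the end of the frame while limiting any increase. A minimal sketch follows, with an assumed linear gain interpolation and an assumed cap on the gain increase; the claim does not bind the concealment to these specific choices.

import numpy as np

def control_recovery_energy(synth, E_end_concealed, E_target, max_gain_up=1.5):
    """Sketch: scale the start of the first good frame to the energy at the end
    of the last concealed frame, then interpolate the gain linearly toward the
    gain implied by the received energy parameter, limiting the increase."""
    eps = 1e-12
    E_begin = np.mean(synth[: len(synth) // 4] ** 2) + eps
    g0 = np.sqrt(E_end_concealed / E_begin)           # match the concealed-frame energy at the start
    g1 = np.sqrt(E_target / (np.mean(synth ** 2) + eps))
    g1 = min(g1, max_gain_up * g0)                    # limit the increase in energy
    gains = np.linspace(g0, g1, len(synth))           # converge toward the end of the frame
    return gains * synth
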
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (sound signal) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (sound signal) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (sound signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (following inequality) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US5754976A
CLAIM 8
. The method of claim 7 , wherein the step of maximizing said given ratio comprises the step of skipping at least the innermost loop of the N nested loops whenever the following inequality (average pitch value) is true ##EQU32## where S_p_n is the amplitude pre-assigned to position p_n , D_p_n is the p_n-th component of the target signal D , and T_D is a threshold related to the backward-filtered target signal D .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (sound signal) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (sound signal) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step comprises pre-establishing , in relation to the sound signal , a function S p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S_p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S_p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S_p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (sound signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S_p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (sound signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5754976A
CLAIM 1
. A method of conducting a search in a codebook in view of encoding a sound signal (sound signal) , said codebook consisting of a set of pulse amplitude/position combinations , each pulse amplitude/position combination defining L different positions and comprising both zero-amplitude pulses and non-zero-amplitude pulses assigned to respective positions p=1 , 2 , . . . L of the combination , and each non-zero-amplitude pulse assuming at least one of q possible amplitudes , said method comprising the steps of : pre-selecting from said codebook a subset of pulse amplitude/position combinations in relation to the sound signal ;
searching only said subset of pulse amplitude/position combinations in view of encoding the sound signal whereby complexity of the search is reduced as only a subset of the pulse amplitude/position combinations of the codebook is searched ;
and wherein the pre-selecting step (signal classification parameter) comprises pre-establishing , in relation to the sound signal , a function S_p pre-assigning to the positions p=1 , 2 , . . . L valid amplitudes out of said q possible amplitudes , and wherein the searching step comprises searching only the pulse amplitude/position combinations of said codebook having non-zero-amplitude pulses which respect the pre-established function .

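The energy readjustment recited in claims 12 and 25, E_q = E_1 · E_LP0 / E_LP1, can be illustrated with a short numerical sketch. The sketch below is only an illustration of the recited relation, not the patentee's implementation; the LP coefficients, frame energy and impulse-response length are assumed values.

```python
# Illustrative sketch only (assumed values, not the patentee's code): the
# excitation-energy adjustment of claims 12/25, applied when the LP filter
# gain of the first good frame exceeds that of the last erased frame.
import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a_coeffs, n_samples=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z)."""
    impulse = np.zeros(n_samples)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e_1, a_last_good, a_first_good):
    """E_q = E_1 * E_LP0 / E_LP1 as recited in claims 12 and 25."""
    e_lp0 = lp_impulse_energy(a_last_good)   # last non-erased frame before the erasure
    e_lp1 = lp_impulse_energy(a_first_good)  # first non-erased frame after the erasure
    return e_1 * e_lp0 / e_lp1

# Assumed LP coefficients in direct form A(z) = 1 + a1*z^-1 + a2*z^-2
a_before = np.array([1.0, -1.2, 0.5])
a_after = np.array([1.0, -1.6, 0.7])
E_q = adjusted_excitation_energy(e_1=0.02, a_last_good=a_before, a_first_good=a_after)
```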



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5664055A

Filed: 1995-06-07     Issued: 1997-09-02

CS-ACELP speech compression system with adaptive pitch prediction filter gain based on a measure of periodicity

(Original Assignee) Nokia of America Corp     (Current Assignee) BlackBerry Ltd

Peter Kroon
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (lower limit) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5664055A
CLAIM 4
. The method of claim 1 wherein the signal reflecting the adaptive codebook gain comprises values which are greater than or equal to a lower limit (average pitch value) and less than or equal to an upper limit .

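For orientation only, the artificial periodic-excitation construction recited in claims 1 and 13 (a low-pass filtered train of pulses anchored on the quantized first glottal pulse position and spaced by the average pitch value) could be sketched as follows; the low-pass taps, frame length and pitch value are assumptions and not taken from the patent.

```python
# Hedged sketch (assumed filter taps, frame length and pitch): build the
# periodic excitation for a lost onset frame as a low-pass filtered pulse
# train, per the reading of claims 1 and 13 above.
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps):
    """One low-pass impulse response centred on the quantized first glottal
    pulse position, then repeated every avg_pitch samples to the frame end."""
    excitation = np.zeros(frame_len)
    half = len(lp_taps) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start = max(0, pos - half)
        stop = min(frame_len, pos - half + len(lp_taps))
        excitation[start:stop] += lp_taps[start - (pos - half):stop - (pos - half)]
        pos += avg_pitch
    return excitation

# Crude windowed-sinc low-pass impulse response (assumed, 17 taps)
lp_taps = np.hamming(17) * np.sinc(np.linspace(-4, 4, 17))
exc = build_periodic_excitation(frame_len=256, first_pulse_pos=35,
                                avg_pitch=80, lp_taps=lp_taps)
```
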
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

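Claims 2, 3, 10 and 11 locate the first glottal pulse as the maximum-amplitude sample within a pitch period and quantize its position (claims 2 and 10 further encode its shape, sign and amplitude). A minimal sketch, under an assumed residual input and a uniform quantization step, is given below.

```python
# Minimal sketch (assumed inputs): locate the first glottal pulse as the
# maximum-amplitude sample in the first pitch period of the LP residual and
# quantize its position; the sign/amplitude values are crude stand-ins for
# the shape/sign/amplitude coding mentioned in claims 2 and 10.
import numpy as np

def first_glottal_pulse(lp_residual, pitch_period, step=4):
    segment = np.asarray(lp_residual[:pitch_period], dtype=float)
    raw_pos = int(np.argmax(np.abs(segment)))   # sample of maximum amplitude
    quantized_pos = (raw_pos // step) * step    # simple uniform position quantizer
    sign = 1 if segment[raw_pos] >= 0 else -1
    amplitude = float(abs(segment[raw_pos]))
    return quantized_pos, sign, amplitude
```
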
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (second output signal, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook (average energy) coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

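Claims 4, 16 and 24 compute the energy information parameter differently depending on the frame class. The sketch below is one plausible reading under assumed frame and class inputs, not the codec's actual routine.

```python
# Plausible reading only (assumed inputs): energy information parameter per
# claims 4/16/24 -- maximum of the signal energy for voiced/onset frames,
# average energy per sample for the remaining classes.
import numpy as np

def energy_information(frame, frame_class, pitch_period):
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        # maximum of the signal energy, taken here over the last pitch period
        return float(np.max(frame[-pitch_period:] ** 2))
    # unvoiced, unvoiced transition, voiced transition: average energy per sample
    return float(np.mean(frame ** 2))
```
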
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US5664055A
CLAIM 14
. The speech processing system of claim 7 wherein the speech processing system is used in a speech encoder (last frame, replacement frame) .

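Claims 5 and 17 first scale the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converge toward the received energy information while limiting any energy increase. A hedged sketch, with assumed window lengths and gain cap, follows.

```python
# Hedged sketch (assumed window lengths and gain cap): energy control of
# claims 5/17 applied to the first non-erased frame after an erasure.
import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    quarter = max(1, n // 4)
    e_begin = float(np.mean(synth[:quarter] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-quarter:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match concealed energy at frame start
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to the received energy information
    gains = np.linspace(g0, g1, n)                          # sample-wise interpolation across the frame
    return synth * gains
```
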
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (second output signal, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (second output signal, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

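Claims 7 and 19 recite two special cases in which the gain at the beginning of the first good frame is simply set equal to the gain at its end. A small decision sketch, with assumed class labels, is given below; outside these cases the interpolation of claims 5/17 would apply.

```python
# Decision sketch only (class labels are assumptions): special cases of
# claims 7/19 where the start-of-frame scaling gain is set equal to the
# end-of-frame gain instead of being interpolated.
def begin_gain(g_end, last_good_class, first_good_class,
               last_good_was_comfort_noise, first_good_is_active):
    voiced_like = {"voiced transition", "voiced", "onset"}
    # voiced -> unvoiced transition across the erasure
    if last_good_class in voiced_like and first_good_class == "unvoiced":
        return g_end
    # comfort noise -> active speech transition across the erasure
    if last_good_was_comfort_noise and first_good_is_active:
        return g_end
    return None  # fall back to the normal interpolation (claims 5/17)
```
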
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US5664055A
CLAIM 14
. The speech processing system of claim 7 wherein the speech processing system is used in a speech encoder (last frame, replacement frame) .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US5664055A
CLAIM 14
. The speech processing system of claim 7 wherein the speech processing system is used in a speech encoder (last frame, replacement frame) .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (lower limit) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5664055A
CLAIM 4
. The method of claim 1 wherein the signal reflecting the adaptive codebook gain comprises values which are greater than or equal to a lower limit (average pitch value) and less than or equal to an upper limit .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (second output signal, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook (average energy) coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US5664055A
CLAIM 14
. The speech processing system of claim 7 wherein the speech processing system is used in a speech encoder (last frame, replacement frame) .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (second output signal, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (second output signal, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US5664055A
CLAIM 14
. The speech processing system of claim 7 wherein the speech processing system is used in a speech encoder (last frame, replacement frame) .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (second output signal, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5664055A
CLAIM 1
. A method for use in a speech processing system which includes a first portion comprising an adaptive codebook and corresponding adaptive codebook amplifier and a second portion comprising a fixed codebook (average energy) coupled to a pitch filter , the pitch filter comprising a delay memory coupled to a pitch filter amplifier , the method comprising : determining the pitch filter gain based on a measure of periodicity of a speech signal (speech signal, decoder determines concealment) ;
and amplifying samples of a signal in said pitch filter based on said determined pitch filter gain .

US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals (speech signal, decoder determines concealment) and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5664055A
CLAIM 12
. The speech processing system of claim 7 wherein said first and second portions generate first and second output signals and wherein the system further comprises : means for summing the first and second output signals ;
and a linear prediction filter (signal classification parameter) , coupled to the means for summing , for generating a speech signal in response to the summed first and second signals .

US5664055A
CLAIM 14
. The speech processing system of claim 7 wherein the speech processing system is used in a speech encoder (last frame, replacement frame) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5699485A

Filed: 1995-06-07     Issued: 1997-12-16

Pitch delay modification during frame erasures

(Original Assignee) Nokia of America Corp     (Current Assignee) BlackBerry Ltd

Yair Shoham
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
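
The construction step recited above (a low-pass filtered periodic train of pulses anchored on the quantized first glottal-pulse position and repeated at the average pitch) can be sketched in C as follows; the buffer layout, the centring convention and the filter itself are assumptions for illustration, not details taken from the patent.

```c
#include <stddef.h>
#include <string.h>

/*
 * Artificial periodic excitation for a lost onset frame: one impulse
 * response of a low-pass filter centred on the decoded first glottal-pulse
 * position, repeated every (rounded) average pitch period up to the end of
 * the region being reconstructed.
 */
static void build_periodic_excitation(double *exc, size_t exc_len,
                                      const double *lp_imp, size_t lp_len,
                                      size_t first_pulse_pos, /* quantized position */
                                      size_t avg_pitch)       /* rounded average pitch lag */
{
    memset(exc, 0, exc_len * sizeof(*exc));
    size_t half = lp_len / 2;   /* centre tap of the impulse response */

    for (size_t pulse = first_pulse_pos; pulse < exc_len; pulse += avg_pitch) {
        for (size_t k = 0; k < lp_len; k++) {
            long idx = (long)pulse + (long)k - (long)half;
            if (idx >= 0 && (size_t)idx < exc_len)
                exc[idx] += lp_imp[k];   /* place one filtered pulse */
        }
        if (avg_pitch == 0)   /* defensive: avoid an infinite loop */
            break;
    }
}
```
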
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .
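
For contrast, the concealment mechanism of the cited US5699485A claims is considerably simpler: the stored pitch period of the first unreliable frame is carried forward and incremented for the next one, optionally only while it does not exceed a threshold (claim 5). A minimal C sketch, with an illustrative one-sample increment:

```c
/*
 * Pitch-delay handling across consecutive unreliable frames in the style of
 * the quoted US5699485A claims: carry the stored pitch period forward and
 * increment it for the next bad frame (claim 1), or increment it only while
 * it does not exceed a threshold (claim 5).  The one-sample increment and
 * the threshold value are illustrative.
 */
static int next_concealment_pitch(int stored_pitch, int threshold)
{
    if (stored_pitch <= threshold)
        return stored_pitch + 1;   /* increment the pitch period by one sample */
    return stored_pitch;           /* above the threshold: keep it unchanged */
}
```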

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
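
A minimal C sketch of the step recited above: the first glottal pulse is taken as the sample of maximum amplitude within one pitch period and its position is quantized. The signal source (e.g. an LP residual) and the quantization step are assumptions for illustration.

```c
#include <math.h>
#include <stddef.h>

/*
 * First glottal pulse taken as the sample of maximum absolute amplitude
 * within one pitch period, with its position quantized by a uniform step.
 */
static size_t first_glottal_pulse_position(const double *res,
                                           size_t pitch_period,
                                           size_t quant_step)
{
    if (pitch_period == 0)
        return 0;

    size_t best = 0;
    double best_amp = fabs(res[0]);
    for (size_t i = 1; i < pitch_period; i++) {
        if (fabs(res[i]) > best_amp) {
            best_amp = fabs(res[i]);
            best = i;
        }
    }
    /* uniform quantization of the position inside the pitch period */
    return (quant_step > 1) ? (best / quant_step) * quant_step : best;
}
```
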
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
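
The energy information parameter recited above switches between two measures according to the frame class. A minimal C sketch follows; reading "maximum of a signal energy" as the maximum squared sample is one plausible interpretation and is marked as such in the comments.

```c
#include <stddef.h>

enum frame_class { UNVOICED, UNVOICED_TRANSITION, VOICED_TRANSITION, VOICED, ONSET };

/*
 * Energy information parameter as recited above: a maximum of the signal
 * energy for frames classified as voiced or onset, an average energy per
 * sample otherwise.  Taking "maximum of a signal energy" as the maximum
 * squared sample is one plausible reading, not the patent's definition.
 */
static double energy_info_parameter(const double *x, size_t n, enum frame_class cls)
{
    double max_e = 0.0, sum_e = 0.0;
    for (size_t i = 0; i < n; i++) {
        double e = x[i] * x[i];
        sum_e += e;
        if (e > max_e)
            max_e = e;
    }
    if (cls == VOICED || cls == ONSET)
        return max_e;                         /* maximum of the signal energy */
    return (n > 0) ? sum_e / (double)n : 0.0; /* average energy per sample    */
}
```
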
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
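
The energy-control step recited above amounts to a gain trajectory across the first good frame: start from a gain that preserves continuity with the concealed frame, converge toward the gain implied by the received energy information, and cap any energy increase. A minimal C sketch, with a linear interpolation chosen purely for illustration:

```c
#include <stddef.h>

/*
 * Gain trajectory across the first good frame after an erasure: start from
 * g0 (continuity with the energy at the end of the concealed frame), end at
 * g1 (derived from the received energy information), interpolate in between
 * and cap any increase.  Linear interpolation and the cap are illustrative.
 */
static void scale_first_good_frame(double *syn, size_t n,
                                   double g0, double g1, double max_gain)
{
    if (g0 > max_gain) g0 = max_gain;   /* limit any increase in energy */
    if (g1 > max_gain) g1 = max_gain;

    for (size_t i = 0; i < n; i++) {
        double a = (n > 1) ? (double)i / (double)(n - 1) : 1.0;
        syn[i] *= (1.0 - a) * g0 + a * g1;   /* converge from g0 toward g1 */
    }
}
```
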
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
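
The two special cases recited above replace the gradual convergence with a flat gain. A minimal C sketch of that decision, with illustrative class labels and flags:

```c
#include <stdbool.h>

enum fr_class { FR_UNVOICED, FR_UNVOICED_TR, FR_VOICED_TR, FR_VOICED, FR_ONSET };

/*
 * Two special cases in which the gain used at the beginning of the first
 * good frame is simply set equal to the gain used at its end: a voiced-to-
 * unvoiced transition across the erasure, and a comfort-noise-to-active-
 * speech transition.  Class labels and flags are illustrative.
 */
static double start_gain(double g_begin, double g_end,
                         enum fr_class last_good, enum fr_class first_good,
                         bool last_was_comfort_noise, bool first_is_active)
{
    bool voiced_to_unvoiced =
        (last_good == FR_VOICED_TR || last_good == FR_VOICED || last_good == FR_ONSET)
        && first_good == FR_UNVOICED;
    bool cn_to_active = last_was_comfort_noise && first_is_active;

    if (voiced_to_unvoiced || cn_to_active)
        return g_end;      /* use the end-of-frame gain from the start */
    return g_begin;        /* otherwise keep the normal starting gain  */
}
```
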
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 3
. The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
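
The E_LP0 and E_LP1 terms of the relation quoted above are energies of LP-filter impulse responses. One straightforward way to obtain them is shown in the hedged C sketch below; the truncation length and the coefficient convention are assumptions, as the claim does not fix them.

```c
#include <stddef.h>

/*
 * Impulse-response energy of an all-pole LP synthesis filter 1/A(z), with
 * A(z) = 1 + a[1]z^-1 + ... + a[order]z^-order (a[0] is assumed to be 1 and
 * is not read).  The response is truncated to 'len' samples.
 */
static double lp_impulse_response_energy(const double *a, size_t order, size_t len)
{
    double h[256];                 /* scratch buffer for the truncated response */
    double energy = 0.0;

    if (len > 256)
        len = 256;

    for (size_t n = 0; n < len; n++) {
        double acc = (n == 0) ? 1.0 : 0.0;      /* unit impulse input */
        for (size_t k = 1; k <= order && k <= n; k++)
            acc -= a[k] * h[n - k];             /* all-pole recursion */
        h[n] = acc;
        energy += acc * acc;
    }
    return energy;
}
```
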
US5699485A
CLAIM 3
. The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 3
. The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 3
. The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699485A
CLAIM 3
. The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

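The phase-information elements of claims 22 and 23 charted above (finding the first glottal pulse as the maximum-amplitude sample within a pitch period and quantizing its position) can be pictured with the short sketch below. The residual input, the uniform quantization step and the function names are assumptions added for illustration, not claim language.

import numpy as np

def first_glottal_pulse_position(residual: np.ndarray, pitch_period: int) -> int:
    # Sample of maximum amplitude within the first pitch period of the frame.
    segment = residual[:pitch_period]
    return int(np.argmax(np.abs(segment)))

def quantize_position(position: int, step: int = 4) -> int:
    # Hypothetical uniform quantizer of the pulse position; the shape, sign and
    # amplitude of the pulse would be encoded separately (claim 22).
    return (position // step) * step
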
US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5699485A
CLAIM 1
. A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

US5699485A
CLAIM 3
. The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US5699485A
CLAIM 5
. A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

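The energy-adjustment relation recited in claim 25 above (and in claims 9, 12 and 21 later in this chart) scales the excitation energy by the ratio of LP-filter impulse-response energies, E_q = E_1 (E_LP0 / E_LP1). The sketch below, which assumes NumPy/SciPy and direct-form LP coefficients of A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, is offered only as an illustration of that relation, not as the claimed decoder.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(lp_coeffs: np.ndarray, length: int = 64) -> float:
    # Energy of the (truncated) impulse response of the synthesis filter 1/A(z).
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], np.concatenate(([1.0], lp_coeffs)), impulse)
    return float(np.dot(h, h))

def adjusted_excitation_energy(e1: float, lp_last_good: np.ndarray,
                               lp_first_good_after: np.ndarray) -> float:
    e_lp0 = lp_impulse_response_energy(lp_last_good)         # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(lp_first_good_after)  # first good frame after the erasure
    return e1 * e_lp0 / e_lp1                                # E_q = E_1 (E_LP0 / E_LP1)
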



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5732389A

Filed: 1995-06-07     Issued: 1998-03-24

Voiced/unvoiced classification of speech for excitation codebook selection in celp speech decoding during frame erasures

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Peter Kroon, Yair Shoham
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (said second portion) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion (first impulse) if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

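Claim 1's artificial periodic excitation (a low-pass filtered train of pulses anchored on the quantized position of the first glottal pulse and repeated every average pitch period) is sketched below. The filter impulse response h_lp, the frame length and the centering convention are assumptions for illustration only.

import numpy as np

def build_periodic_excitation(frame_len: int, first_pulse_pos: int,
                              avg_pitch: int, h_lp: np.ndarray) -> np.ndarray:
    # Center the first low-pass impulse response on the quantized glottal-pulse
    # position, then place further impulse responses every avg_pitch samples
    # up to the end of the region being reconstructed.
    excitation = np.zeros(frame_len)
    half = len(h_lp) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start = pos - half
        for i, tap in enumerate(h_lp):
            idx = start + i
            if 0 <= idx < frame_len:
                excitation[idx] += tap
        pos += avg_pitch
    return excitation
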
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook (average energy) , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

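The energy information parameter of claim 4 above (a maximum of the signal energy for frames classified as voiced or onset, an average energy per sample for other frames) is illustrated by the sketch below; the pitch-synchronous windowing used for the maximum-energy branch is an assumption, since the claim only recites a maximum.

import numpy as np

def energy_information(frame: np.ndarray, frame_class: str, pitch_period: int) -> float:
    if frame_class in ("voiced", "onset"):
        # Maximum of the signal energy, taken here over pitch-length windows.
        hops = range(0, max(len(frame) - pitch_period, 0) + 1, pitch_period)
        windows = [frame[i:i + pitch_period] for i in hops] or [frame]
        return max(float(np.dot(w, w)) for w in windows)
    # Average energy per sample for unvoiced and transition frames.
    return float(np.dot(frame, frame) / len(frame))
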
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5732389A
CLAIM 5
. The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises incrementing) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

US5732389A
CLAIM 5
. The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

US5732389A
CLAIM 5
. The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (said second portion) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion (first impulse) if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

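For comparison, the erasure-time excitation selection of US5732389A claim 1 charted above reduces to choosing only the adaptive-codebook (first portion) contribution for speech classified as periodic and only the fixed-codebook (second portion) contribution otherwise; the placeholder sketch below assumes the two contributions are already available as vectors.

def concealment_excitation(adaptive_contribution, fixed_contribution, is_periodic: bool):
    # Periodic (voiced) speech: adaptive codebook only; non-periodic (unvoiced)
    # speech: fixed codebook only, per US5732389A claim 1.
    return adaptive_contribution if is_periodic else fixed_contribution
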
US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook (average energy) , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5732389A
CLAIM 5
. The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises incrementing) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

US5732389A
CLAIM 5
. The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook (average energy) , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5732389A
CLAIM 1
. A method for use in a speech decoder which includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook , said decoder generating a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame (current frame, decoder determines concealment) of compressed speech information , the method comprising : classifying a speech signal to be generated by the decoder as representing periodic speech or as representing non-periodic speech ;
based on the classification of the speech signal , either generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion if the speech signal is classified as representing periodic speech , or generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion if the speech signal is classified as representing non-periodic speech .

US5732389A
CLAIM 5
. The method of claim 4 wherein the step of determining the adaptive codebook delay signal comprises incrementing (LP filter) the measure of speech signal pitch-period by one or more speech signal sample intervals .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5699482A

Filed: 1995-05-11     Issued: 1997-12-16

Fast sparse-algebraic-codebook search for efficient speech coding

(Original Assignee) Universite de Sherbrooke     (Current Assignee) Universite de Sherbrooke

Jean-Pierre Adoul, Claude Laflamme
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

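US5699482A claim 1, charted above, evaluates the criterion (DA_k^T /α_k)^2 for each sparse codeword using only N numerator terms and N(N+1)/2 denominator terms. The sketch below illustrates that term count under assumed array representations of the backward-filtered target D and the Toeplitz autocorrelation matrix U; it is not the patented search procedure itself.

import numpy as np

def target_ratio(D: np.ndarray, U: np.ndarray,
                 positions: list, amplitudes: list) -> float:
    # Numerator: N terms, one per non-zero pulse at position p_i with amplitude S(i).
    n = len(positions)
    numerator = sum(amplitudes[i] * D[positions[i]] for i in range(n))
    # Denominator: N diagonal terms plus N(N-1)/2 distinct off-diagonal terms,
    # i.e. N(N+1)/2 terms in total, drawn from the Toeplitz matrix U.
    denominator = 0.0
    for i in range(n):
        denominator += amplitudes[i] ** 2 * U[positions[i], positions[i]]
        for j in range(i + 1, n):
            denominator += 2.0 * amplitudes[i] * amplitudes[j] * U[positions[i], positions[j]]
    return numerator ** 2 / denominator
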
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i th non-zero pulse of the algebraic codeword A k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p i is the position of the i th non-zero pulse of the algebraic codeword A k ;
p j is the position of the j th non-zero pulse of the algebraic codeword A k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5699482A
CLAIM 1
. A method of calculating an index k for encoding a sound signal according to a Code-Excited Linear Prediction technique using a sparse algebraic code to generate an algebraic codeword in the form of an L-sample long waveform comprising a small number N of non-zero pulses each of which is assignable to different positions in the waveform to thereby enable composition of several of algebraic codewords A_k , said index calculating method comprising the steps of : (a) calculating a target ratio (DA_k^T /α_k)^2 for each algebraic codeword among a plurality of said algebraic codewords A_k ;
(b) determining the largest ratio among said calculated target ratios ;
and (c) extracting the index k corresponding to the largest calculated target ratio ;
wherein , because of the algebraic-code sparsity , the computation involved in said step of calculating a target ratio is reduced to the sum of only N and N(N+1)/2 terms for the numerator and denominator , respectively , namely ##EQU10## where : i=1 , 2 , . . . N ;
S(i) is the amplitude of the i-th non-zero pulse of the algebraic codeword A_k ;
D is a backward-filtered version of an L-sample block of said sound signal ;
p_i is the position of the i-th non-zero pulse of the algebraic codeword A_k ;
p_j is the position of the j-th non-zero pulse of the algebraic codeword A_k ;
and U is a Toeplitz matrix of autocorrelation terms defined by the following equation : ##EQU11## where : m=1 , 2 , . . . L ;
and h(n) is the impulse response (impulse responses, impulse response, LP filter) of a transfer function H varying in time with parameters representative of spectral characteristics of said sound signal and taking into account long term prediction parameters characterizing a periodicity of said sound signal .
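
As a reading aid for the mapped limitation of US5699482A claim 1, the following minimal Python sketch illustrates the sparse search criterion (DA_k^T/α_k)^2, assuming the conventional formulation in which the numerator needs only N terms (one per non-zero pulse) and the denominator α_k needs N(N+1)/2 terms drawn from the Toeplitz autocorrelation matrix U. All function and variable names are illustrative and not taken from either patent.

```python
import numpy as np

def target_ratio(D, positions, amplitudes, U):
    """Sketch of the sparse criterion (D·A_k^T)^2 / alpha_k for one algebraic
    codeword with N non-zero pulses. D is the backward-filtered target (1-D
    numpy array) and U a 2-D numpy array of autocorrelation terms."""
    # Numerator: only N terms, one per non-zero pulse.
    num = sum(amplitudes[i] * D[positions[i]] for i in range(len(positions)))
    # Denominator: N diagonal terms plus N(N-1)/2 cross terms, i.e. N(N+1)/2 in total.
    alpha = 0.0
    for i, (pi, si) in enumerate(zip(positions, amplitudes)):
        alpha += si * si * U[pi, pi]
        for pj, sj in zip(positions[i + 1:], amplitudes[i + 1:]):
            alpha += 2.0 * si * sj * U[pi, pj]
    return num * num / alpha

def best_index(D, codewords, U):
    """Return the index k of the codeword that maximises the target ratio.
    `codewords` is a list of (positions, amplitudes) pairs."""
    ratios = [target_ratio(D, pos, amp, U) for pos, amp in codewords]
    return int(np.argmax(ratios))
```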




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5701390A

Filed: 1995-02-22     Issued: 1997-12-23

Synthesis of MBE-based coded speech using regenerated phase information

(Original Assignee) Digital Voice Systems Inc     (Current Assignee) Digital Voice Systems Inc

Daniel W. Griffin, John C. Hardwick
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises information) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5701390A
CLAIM 4
. The subject matter of claim 3 , wherein the spectral envelope information comprises information (LP filter) representing spectral magnitudes at harmonic multiples of the frequency of the speech signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises information) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 √(E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701390A
CLAIM 4
. The subject matter of claim 3 , wherein the spectral envelope information comprises information (LP filter) representing spectral magnitudes at harmonic multiples of the frequency of the speech signal .
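
As a reading aid for the E_q relation recited in claims 9, 12, 21 and 25, the sketch below shows one way a decoder could rescale the LP-filter excitation when the LP gain of the first good frame exceeds that of the last erased frame, using impulse-response energies as the gain measure. This is a minimal illustration under those assumptions; all names are illustrative and not drawn from the patent.

```python
import numpy as np

def adjust_excitation(excitation, e1, h_last_good, h_first_good):
    """Rescale the excitation of the first non-erased frame so that its energy
    matches E_q = E_1 * sqrt(E_LP0 / E_LP1). Inputs are 1-D numpy arrays."""
    e_lp0 = float(np.sum(h_last_good ** 2))   # energy of LP impulse response before erasure
    e_lp1 = float(np.sum(h_first_good ** 2))  # energy of LP impulse response after erasure
    e_q = e1 * np.sqrt(e_lp0 / e_lp1)         # target excitation energy
    e_cur = float(np.sum(excitation ** 2)) + 1e-12
    return excitation * np.sqrt(e_q / e_cur)  # scale to the target energy
```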

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises information) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701390A
CLAIM 4
. The subject matter of claim 3 , wherein the spectral envelope information comprises information (LP filter) representing spectral magnitudes at harmonic multiples of the frequency of the speech signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises information) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5701390A
CLAIM 4
. The subject matter of claim 3 , wherein the spectral envelope information comprises information (LP filter) representing spectral magnitudes at harmonic multiples of the frequency of the speech signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises information) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 √(E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701390A
CLAIM 4
. The subject matter of claim 3 , wherein the spectral envelope information comprises information (LP filter) representing spectral magnitudes at harmonic multiples of the frequency of the speech signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises information) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5701390A
CLAIM 4
. The subject matter of claim 3 , wherein the spectral envelope information comprises information (LP filter) representing spectral magnitudes at harmonic multiples of the frequency of the speech signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH08110799A

Filed: 1994-10-07     Issued: 1996-04-30

ベクトル量子化方法及びその復号化器

(Original Assignee) Nippon Telegr & Teleph Corp <Ntt>; 日本電信電話株式会社     

Jiyoutarou Ikedo, Akitoshi Kataoka, 丈太朗 池戸, 章俊 片岡
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音 (first impulse) 声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。
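
For claims 1 and 13, the recited reconstruction of a lost onset frame is a low-pass filtered periodic pulse train: the first low-pass impulse response is centred on the quantized first-glottal-pulse position and the remaining ones are spaced by the (rounded) average pitch. The sketch below is a minimal illustration of that construction under those assumptions; the names and the shape of the low-pass impulse response are illustrative only.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_impulse):
    """Build an artificial periodic excitation part for a lost onset frame.
    `lp_impulse` is a 1-D array holding the low-pass filter impulse response."""
    exc = np.zeros(frame_len)
    half = len(lp_impulse) // 2
    pos = int(first_pulse_pos)
    pitch = max(1, int(round(avg_pitch)))
    while pos < frame_len:
        start = pos - half
        for k, h in enumerate(lp_impulse):   # centre one impulse response on pos
            idx = start + k
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += pitch                         # next pulse one average pitch later
    return exc
```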

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。
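
Claims 2, 10, 14 and 22 recite encoding the shape, sign and amplitude of the first glottal pulse. The sketch below illustrates one plausible way to do this, assuming a small codebook of candidate pulse shapes matched by correlation; both the shape table and the matching criterion are assumptions for illustration, not taken from the patent.

```python
import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_table):
    """Encode sign and amplitude at the pulse position and pick the candidate
    shape (row of `shape_table`, a 2-D numpy array) that correlates best."""
    sign = 1 if residual[pulse_pos] >= 0 else -1
    amplitude = abs(float(residual[pulse_pos]))
    seg = np.asarray(residual[pulse_pos:pulse_pos + shape_table.shape[1]], dtype=float)
    scores = shape_table[:, :len(seg)] @ (sign * seg)   # correlate candidate shapes
    shape_index = int(np.argmax(scores))
    return shape_index, sign, amplitude
```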

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。
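
Claims 3, 11, 15 and 23 determine the first-glottal-pulse position as the sample of maximum amplitude within a pitch period and quantize that position. A minimal sketch of that step follows; the uniform 4-sample quantization step is an assumption, not taken from the claim.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Locate the sample of maximum amplitude inside the first pitch period and
    quantize its position with a uniform step (illustrative value)."""
    segment = np.abs(np.asarray(residual[:pitch_period], dtype=float))
    pos = int(np.argmax(segment))     # sample of maximum amplitude = first glottal pulse
    q_index = pos // step             # quantizer index to be transmitted
    q_pos = q_index * step            # reconstructed (quantized) position
    return q_index, q_pos
```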

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。
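
Claims 4, 16 and 24 compute the energy information parameter differently per class: relative to the maximum of the signal energy for voiced or onset frames, and relative to the average energy per sample otherwise. The sketch below is a minimal illustration of that split, treating a sample's energy as its square; the class labels are taken from the claim wording, everything else is illustrative.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter for one frame (1-D numpy array of samples)."""
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))   # maximum of the signal energy
    return float(np.mean(frame ** 2))      # average energy per sample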

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。
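
Claims 5 and 17 describe controlling the energy of the synthesized signal in the first good frame: start at an energy similar to the end of the last concealed frame, then converge toward the energy signalled by the received energy parameter while limiting any increase. The sketch below assumes a sample-by-sample linear gain interpolation and an arbitrary gain cap; both are assumptions for illustration, not the patent's specific procedure.

```python
import numpy as np

def rescale_first_good_frame(synth, e_concealed_end, e_received, max_gain=2.0):
    """Scale the first non-erased frame from the concealed-frame energy toward the
    received energy parameter, capping the gain to limit energy increases."""
    n = len(synth)
    e_begin = float(np.mean(synth[: n // 4] ** 2)) + 1e-12   # energy near frame start
    e_end = float(np.mean(synth[-(n // 4):] ** 2)) + 1e-12   # energy near frame end
    g0 = min(np.sqrt(e_concealed_end / e_begin), max_gain)   # match previous energy
    g1 = min(np.sqrt(e_received / e_end), max_gain)          # converge, limited increase
    gains = np.linspace(g0, g1, n)                           # linear gain trajectory
    return synth * gains
```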

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (の雑音) and the first non erased frame received after frame erasure is encoded as active speech .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音 (comfort noise) 信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。
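
Claims 7 and 19 identify two transitions in which the scaling gain at the beginning of the first good frame is simply set equal to the gain used at its end: a voiced-to-unvoiced transition, and a comfort-noise-to-active-speech transition. The sketch below captures only that selection logic; class labels follow the claim wording and the helper is illustrative.

```python
def equalize_gains(g0, g1, last_class, first_class, last_was_cng, first_is_active):
    """Return (g0, g1) with g0 forced equal to g1 in the two recited transitions."""
    voiced_like = {"voiced", "voiced transition", "onset"}
    voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
    dtx_to_speech = last_was_cng and first_is_active
    if voiced_to_unvoiced or dtx_to_speech:
        g0 = g1
    return g0, g1
```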

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音 (first impulse) 声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (の雑音) and the first non erased frame received after frame erasure is encoded as active speech .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音 (comfort noise) 信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH08110799A
CLAIM 7
【請求項7】 入力音声信号 (sound signal, speech signal) のスペクトル形状パラメー タを求め、このスペクトル形状パラメータを量子化し、 その量子化出力に応じて音声合成フィルタのフィルタ係 数を設定し、 ピッチ励振源から各種ピッチ周期をもつピッチ周期信号 の1つを選択し、その選択したピッチ周期信号に利得を 与え、 符号帳励振源から複数の雑音信号の1つを選択し、その 選択した雑音信号に利得を与え、 上記利得付与されたピッチ周期信号と、上記利得付与さ れた雑音信号とを加算し、その加算出力で上記音声合成 フィルタを駆動し、 上記各利得付与を符号帳から選択して行い、 上記音声合成フィルタの合成音声信号の上記入力音声信 号に対する歪が最小になるように、上記ピッチ励振源の 選択と上記符号帳励振源の選択と、上記符号帳の選択と を行うベクトル量子化方法において、 上記符号帳を複数設け、これら符号帳からそれぞれ選択 した利得ベクトルを加算し、その加算した合成ベクトル の対応する要素により上記ピッチ周期信号及び上記雑音 信号に対する利得付与をそれぞれ行い、 上記加算した合成ベクトルにはそれぞれその各要素が互 いに異なっている重み係数が乗算されたものであり、 かつ上記重み係数は上記符号帳ごとに互いに異なってい ることを特徴とするベクトル量子化方法。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5444816A

Filed: 1992-09-10     Issued: 1995-08-22

Dynamic codebook for efficient speech coding based on algebraic codes

(Original Assignee) Universite de Sherbrooke     (Current Assignee) Universite de Sherbrooke

Jean-Pierre Adoul, Claude Laflamme
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (frequency characteristics) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .

US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .
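
Note on claim 4: the energy information parameter is computed differently depending on the frame class, in relation to the maximum signal energy for voiced or onset frames and in relation to the average energy per sample otherwise. A minimal Python sketch of that branch follows; the class labels and the use of raw (non-logarithmic) energies are simplifying assumptions.

```python
import numpy as np

def energy_information(frame: np.ndarray, frame_class: str) -> float:
    """Illustrative energy information parameter: maximum signal energy for frames
    classified as voiced or onset, average energy per sample for other frames."""
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))   # related to the maximum of the signal energy
    return float(np.mean(frame ** 2))      # average energy per sample
```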

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .
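
Note on claim 5: the recited energy control scales the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converges toward the transmitted energy information while limiting any increase. The following sketch is one simplified realization; the 16-sample measurement windows, the linear gain interpolation and the gain cap are assumptions for illustration only.

```python
import numpy as np

def scale_recovered_frame(synth: np.ndarray, e_concealed_end: float,
                          e_target: float, max_gain: float = 1.5) -> np.ndarray:
    """Illustrative energy control after an erasure: match the concealed-frame energy
    at the frame start, converge toward the received energy target at the frame end,
    and limit any increase in energy via a gain cap (all constants are assumptions)."""
    e_start = float(np.mean(synth[:16] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-16:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_concealed_end / e_start), max_gain)  # gain at frame start
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # gain at frame end
    gains = np.linspace(g0, g1, len(synth))                 # sample-wise interpolation
    return synth * gains
```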

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (frequency characteristics) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (frequency characteristics) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .
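
Note on claim 7: the gain used at the start of the recovered frame is set equal to the end-of-frame gain in exactly two recited situations, a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition. The small decision helper below restates that logic; the class labels and boolean flags are hypothetical names, not fields of any actual decoder.

```python
def use_end_gain_at_frame_start(last_good_class: str, first_good_class: str,
                                last_good_is_comfort_noise: bool,
                                first_good_is_active_speech: bool) -> bool:
    """Return True in the two claimed cases where the start-of-frame scaling gain is
    made equal to the end-of-frame gain (illustrative restatement of claim 7)."""
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    inactive_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or inactive_to_active
```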

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .
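
Note on claim 8: no energy parameter is transmitted; instead, when the LP filter of the first good frame has a higher gain than the LP filter used during concealment, the decoder scales the excitation so the synthesized energy does not jump. The sketch below approximates the LP filter gain by the energy of its impulse response and applies an amplitude-domain correction; the impulse-response length and the square-root scaling are assumptions, not the '710 patent's exact procedure.

```python
import numpy as np
from scipy.signal import lfilter

def lp_filter_gain(lp_coeffs: np.ndarray, n: int = 64) -> float:
    """Energy of the impulse response of the LP synthesis filter 1/A(z), used here as
    the filter gain (lp_coeffs = [1, a1, ..., aM]; length n is an assumption)."""
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], lp_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjust_excitation(exc: np.ndarray, e_lp_concealed: float, e_lp_new: float) -> np.ndarray:
    """If the new LP filter gain exceeds the gain used during concealment, scale the
    excitation amplitude so that the synthesized energy stays roughly constant."""
    if e_lp_new > e_lp_concealed:
        exc = exc * np.sqrt(e_lp_concealed / e_lp_new)
    return exc
```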

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .
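
Note on the relation in claim 12 (reconstructed above as E_q = E_1 (E_LP0 / E_LP1)): the decoder-side excitation energy target is simply the end-of-frame energy rescaled by the ratio of the two LP impulse-response energies. A short numerical illustration with arbitrary values:

```python
def target_excitation_energy(e1: float, e_lp0: float, e_lp1: float) -> float:
    """E_q = E_1 * (E_LP0 / E_LP1), as reconstructed from claim 12 above."""
    return e1 * (e_lp0 / e_lp1)

# Arbitrary illustrative numbers: if the new LP filter's impulse response carries twice
# the energy of the previous one, the excitation energy target is halved.
print(target_excitation_energy(e1=0.8, e_lp0=1.0, e_lp1=2.0))  # -> 0.4
```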

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (frequency characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .

US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (frequency characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (frequency characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (frequency characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5444816A
CLAIM 1
. A method of producing an excitation signal to be used by a sound signal synthesis means to synthesize a sound signal , comprising the steps of : generating a codeword signal in response to an index signal associated to said codeword signal , said signal generating step using an algebraic code to generate said codeword signal ;
and prefiltering the generated codeword signal to produce said excitation signal , said prefiltering step comprising processing the codeword signal through an adaptive prefilter having a transfer function varying in time in relation to parameters representative of spectral characteristics of said sound signal to thereby shape frequency characteristics (speech signal) of the excitation signal so as to damp frequencies perceptually annoying a human ear .

US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5444816A
CLAIM 11
. The method of claim 10 , wherein said codeword selecting step (signal classification parameter) comprises processing in an innermost loop of said embedded loops said calculated ratios to determine the largest ratio .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5122875A

Filed: 1991-02-27     Issued: 1992-06-16

An HDTV compression system

(Original Assignee) General Electric Co     (Current Assignee) General Electric Co

Dipankar Raychaudhuri, Joel W. Zdepski, Glenn A. Reitmeier, Charles M. Wine
US7693710B2
CLAIM 1
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (second data stream) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priorty video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data stream (onset frame) s corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video dat and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .
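
Note on the claim 1 element quoted above: when an onset frame is lost, the periodic excitation is built artificially as a low-pass filtered train of pulses, with the first low-pass impulse response centered on the quantized first-glottal-pulse position and the remaining ones spaced by the average pitch value. The sketch below is an illustrative simplification (it fills a single frame rather than extending to the end of the last affected subframe, and assumes a positive average pitch and a short FIR low-pass impulse response supplied by the caller).

```python
import numpy as np

def artificial_periodic_excitation(frame_len: int, first_pulse_pos: int,
                                   avg_pitch: int, lp_impulse: np.ndarray) -> np.ndarray:
    """Build a low-pass filtered periodic pulse train: center the first impulse response
    on the quantized glottal pulse position, then repeat it every avg_pitch samples."""
    exc = np.zeros(frame_len + len(lp_impulse))
    half = len(lp_impulse) // 2
    pos = first_pulse_pos
    while pos < frame_len:                  # place pulses up to the end of the frame
        for i, h in enumerate(lp_impulse):  # center one impulse response at pos
            idx = pos - half + i
            if 0 <= idx < len(exc):
                exc[idx] += h
        pos += avg_pitch                    # next pulse one average pitch period later
    return exc[:frame_len]
```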

US7693710B2
CLAIM 2
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priorty video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video dat and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5122875A
CLAIM 4
. The apparatus set forth in claim 2 wherein said first means includes means , responsive to a control signal (maximum amplitude) for adaptively controlling the volume of said compressed version of said video signals ;
and wherein said rate buffers include means for providing a signal indicating the relative fullness of said rate buffers ;
and means responsive to said signal indicating the relative fullness of said rate buffers for generating said control signal .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priorty video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video dat and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (respective frame) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (audio data) per sample for other frames .
US5122875A
CLAIM 1
. Apparatus for encoding a television signal comprising : a source of video signals ;
first means , coupled to said source , for providing a compressed version of said video signals , said compressed version including a first sequence of codewords of varying types defining said compressed video signal and a second sequence of codewords associated with said first sequence and indicating said types ;
second means , coupled to said first means , and responsive to said second sequence of codewords for variably parsing said first sequence of codewords into a high priority codeword sequence and a low priority codeword sequence according to the type of respective codewords of said first sequence ;
a source of audio signal occurring as a sequence of codewords ;
third means , coupled to said second means and said source of audio signal , for forming mutually exclusive transport blocks of said high priority codeword sequence , said low priority codeword sequence and said sequence of audio codewords , each transport block including a predetermined bit capacity occupied by codewords of one of high priority , low priority or audio data (average energy) , transport block header information for identifying said data , and error check bits generated over said data and said transport block header information , said third means providing a first transport block sequence including transport blocks of said high priority codewords , a second transport block sequence including transport blocks of said low priority codewords , and wherein transport blocks of said audio codewords are interleaved with transport blocks of at least one of said first and second transport block sequences ;
and means including means for modulating said first and second transport block sequences on separate carriers , and combining said separately modulated carriers .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priorty video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video dat and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priorty video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video dat and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (respective frame) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (respective frame) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priorty video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame, discrete cosine) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
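Editor's note: a short numerical sketch of the relation as reconstructed above, E_q = E_1 · (E_LP0 / E_LP1). The numbers and the rescaling helper are illustrative assumptions; the point is only that a larger E_LP1 (a louder new LP filter) drives the target excitation energy E_q below E_1.

```python
import numpy as np

def target_excitation_energy(e1, e_lp0, e_lp1):
    """E_q = E_1 * (E_LP0 / E_LP1): corrected excitation energy for the first good frame."""
    return e1 * (e_lp0 / e_lp1)

def scale_to_energy(excitation, e_q):
    """Rescale an excitation vector so its total energy equals e_q."""
    e_cur = float(np.dot(excitation, excitation))
    return excitation * np.sqrt(e_q / e_cur) if e_cur > 0.0 else excitation

# Illustrative numbers only: the new LP filter is louder (E_LP1 > E_LP0),
# so the excitation energy is reduced from 18000 to 11250.
print(target_excitation_energy(18000.0, 250.0, 400.0))   # -> 11250.0
```
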
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frames of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
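Editor's note: a hedged sketch of the encoder-side phase information recited in claim 10. It assumes the first glottal pulse is located in the LP residual within the first pitch period of the frame, that the "shape" is chosen by correlation against a small pulse-shape codebook, and that the amplitude is quantized with a toy uniform quantizer; all of these choices and names are assumptions, not material from the patent.

```python
import numpy as np

def encode_first_glottal_pulse(residual, pitch_period, shape_codebook):
    """Return (position, shape_index, sign, quantized_amplitude) for the first pitch period.

    Assumes pitch_period >= shape_codebook.shape[1] (the stored pulse-shape length).
    """
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))            # sample of maximum amplitude
    amp = float(segment[pos])
    sign = 0 if amp >= 0.0 else 1
    pulse_len = shape_codebook.shape[1]
    start = min(max(pos - pulse_len // 2, 0), len(segment) - pulse_len)
    window = segment[start:start + pulse_len]        # waveform around the located pulse
    scores = shape_codebook @ window                 # correlation with each candidate shape
    shape_index = int(np.argmax(np.abs(scores)))
    q_amp = round(abs(amp) / 0.25) * 0.25            # toy uniform amplitude quantizer
    return pos, shape_index, sign, q_amp
```
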
US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
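Editor's note: a minimal sketch of the position step recited in claim 11, assuming the sample of maximum amplitude inside the first pitch period is taken as the first glottal pulse and its position is quantized onto a grid that fits a fixed number of bits. The 6-bit budget and the coarser-step rule for long pitch periods are illustrative assumptions.

```python
import numpy as np

def quantize_pulse_position(residual, pitch_period, num_bits=6):
    """Locate and quantize the position of the first glottal pulse within the pitch period."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))                # sample of maximum amplitude
    levels = 1 << num_bits
    step = max(1, int(np.ceil(pitch_period / levels)))   # coarser grid for longer pitch periods
    index = pos // step                                  # index actually transmitted
    decoded_pos = index * step                           # position rebuilt at the decoder
    return index, decoded_pos
```
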
US5122875A
CLAIM 4
. The apparatus set forth in claim 2 wherein said first means includes means , responsive to a control signal (maximum amplitude) for adaptively controlling the volume of said compressed version of said video signals ;
and wherein said rate buffers include means for providing a signal indicating the relative fullness of said rate buffers ;
and means responsive to said signal indicating the relative fullness of said rate buffers for generating said control signal .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 12
. The apparatus set forth in claim 9 wherein said television video signal includes forward error correction (concealing frame erasure) codes and said first means includes means , responsive to said forward error correction codes , for performing error correction on said television signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame, discrete cosine) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (video signal data) , when at least one onset frame (second data stream) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
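Editor's note: a hedged sketch of the artificial onset reconstruction recited in claim 13. When an onset frame is lost, the periodic part of the excitation is rebuilt as a train of low-pass filter impulse responses, the first one centred on the quantized first-glottal-pulse position and the following ones spaced by the average pitch value. The short symmetric 3-tap FIR used here is only a stand-in for whatever low-pass filter a real codec would use, and the function name is an assumption.

```python
import numpy as np

def build_artificial_periodic_excitation(frame_len, first_pulse_pos, avg_pitch,
                                         lp_filter=np.array([0.25, 0.5, 0.25])):
    """Rebuild the periodic excitation for a lost onset frame as a low-pass filtered pulse train."""
    excitation = np.zeros(frame_len)
    half = len(lp_filter) // 2
    period = max(1, int(round(avg_pitch)))        # guard against a degenerate pitch value
    pos = int(first_pulse_pos)                    # quantized position of the first glottal pulse
    while pos < frame_len:
        for k, tap in enumerate(lp_filter):       # centre one low-pass impulse response at 'pos'
            idx = pos - half + k
            if 0 <= idx < frame_len:
                excitation[idx] += tap
        pos += period                             # next pulse one average pitch period later
    return excitation
```
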
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data stream (onset frame) s corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 15
. Apparatus for encoding a television signal representing images , comprising : a source of a sequence of codewords representing compressed video signal ;
first means , coupled to said source and responsive to said codewords for parsing , as a function of the amount of video signal data (decoder constructs) representing respective predetermined image areas , said sequence of codewords into a high priority codeword sequence and a low priority codeword sequence according to the relative importance of respective codewords for image reproduction , and providing indicia for reconstructing said high and low priority sequences into a single sequence ;
second means , coupled to said first means , for forming mutually exclusive transport blocks of said high priority codeword sequence , and said low priority codeword sequence , each transport block including a predetermined bit capacity occupied by codewords of one high priority and low priority data , transport block header information , including said indicia , for identifying said data , and error check bits generated over said data and said transport block header information , said second means providing a first transport block sequence including transport blocks of said high priority codewords , and a second transport block sequence including transport blocks of said low priority codewords ;
forward error check means , for developing error correction data corresponding to mutually exclusive portions of said first transport block sequence and said second transport block sequence and appending the corresponding error correction data to the respective first transport block sequence and second transport block sequence .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frames of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5122875A
CLAIM 4
. The apparatus set forth in claim 2 wherein said first means includes means , responsive to a control signal (maximum amplitude) for adaptively controlling the volume of said compressed version of said video signals ;
and wherein said rate buffers include means for providing a signal indicating the relative fullness of said rate buffers ;
and means responsive to said signal indicating the relative fullness of said rate buffers for generating said control signal .

US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (respective frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
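Editor's note: a minimal sketch of the energy information parameter recited in claim 16. It assumes "a maximum of a signal energy" can be read, for illustration, as the largest per-sample squared value within the frame, while other classes use the average energy per sample; the string class labels are placeholders.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter for one frame, given its classification label."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x * x))    # related to the maximum of the signal energy
    return float(np.mean(x * x))       # average energy per sample for the other classes
```
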
US5122875A
CLAIM 1
. Apparatus for encoding a television signal comprising : a source of video signals ;
first means , coupled to said source , for providing a compressed version of said video signals , said compressed version including a first sequence of codewords of varying types defining said compressed video signal and a second sequence of codewords associated with said first sequence and indicating said types ;
second means , coupled to said first means , and responsive to said second sequence of codewords for variably parsing said first sequence of codewords into a high priority codeword sequence and a low priority codeword sequence according to the type of respective codewords of said first sequence ;
a source of audio signal occurring as a sequence of codewords ;
third means , coupled to said second means and said source of audio signal , for forming mutually exclusive transport blocks of said high priority codeword sequence , said low priority codeword sequence and said sequence of audio codewords , each transport block including a predetermined bit capacity occupied by codewords of one of high priority , low priority or audio data (average energy) , transport block header information for identifying said data , and error check bits generated over said data and said transport block header information , said third means providing a first transport block sequence including transport blocks of said high priority codewords , a second transport block sequence including transport blocks of said low priority codewords , and wherein transport blocks of said audio codewords are interleaved with transport blocks of at least one of said first and second transport block sequences ;
and means including means for modulating said first and second transport block sequences on separate carriers , and combining said separately modulated carriers .

US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
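Editor's note: a hedged sketch of the decoder-side energy control recited in claim 17. The first good frame after an erasure is scaled with a gain that starts at a value matching the energy at the end of the last concealed frame and moves toward a value matching the transmitted energy information, with the target gain clipped so any energy increase stays bounded. The linear per-sample gain ramp, the quarter-frame analysis windows and the 2.0 ceiling are illustrative assumptions.

```python
import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_target_end, max_gain=2.0):
    """Scale the first good frame after an erasure for energy continuity and convergence."""
    x = np.asarray(synth, dtype=float)
    q = max(1, len(x) // 4)                                 # short analysis windows at both ends
    e_begin = float(np.mean(x[:q] ** 2)) + 1e-12            # decoded energy at the frame start
    e_end = float(np.mean(x[-q:] ** 2)) + 1e-12             # decoded energy near the frame end
    g0 = np.sqrt(e_end_concealed / e_begin)                 # match the end of the concealed frame
    g1 = min(np.sqrt(e_target_end / e_end), max_gain)       # converge to the received energy
                                                            # information, limiting any increase
    gains = np.linspace(g0, g1, len(x))                     # simple per-sample gain interpolation
    return x * gains
```
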
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (respective frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
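Editor's note: a minimal sketch of the onset-specific safeguard recited in claim 18, assuming the "given value" is a fixed gain ceiling applied when the first good frame after an erasure is classified as onset. The numeric ceiling is an illustrative assumption, not a value from the patent.

```python
ONSET_GAIN_LIMIT = 1.5   # the "given value" of the claim; the actual number is codec-specific

def limit_onset_gain(gain, first_good_frame_class):
    """Clip the scaling gain when the first good frame after an erasure is an onset frame."""
    if first_good_frame_class == "onset":
        return min(gain, ONSET_GAIN_LIMIT)
    return gain
```
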
US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (respective frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame is intraframe or interframe encoded .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame, discrete cosine) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frames of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5122875A
CLAIM 4
. The apparatus set forth in claim 2 wherein said first means includes means , responsive to a control signal (maximum amplitude) for adaptively controlling the volume of said compressed version of said video signals ;
and wherein said rate buffers include means for providing a signal indicating the relative fullness of said rate buffers ;
and means responsive to said signal indicating the relative fullness of said rate buffers for generating said control signal .

US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frame (signal classification parameter, speech signal) s of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal ;

and a communication link (current frame, discrete cosine) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (respective frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
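As a rough illustration of the energy-information computation recited in the last element above (a maximum of the signal energy for frames classified as voiced or onset, an average energy per sample otherwise), the sketch below uses an assumed 32-sample analysis window and plain string class labels; neither detail comes from the claim.

```python
import numpy as np


def energy_information(frame: np.ndarray, frame_class: str, win: int = 32) -> float:
    """Energy information parameter: maximum windowed energy for voiced/onset
    frames, average energy per sample for the remaining classes."""
    if frame_class in ("voiced", "onset"):
        energies = [float(np.sum(frame[i:i + win] ** 2))
                    for i in range(0, len(frame) - win + 1, win)]
        if not energies:  # frame shorter than one window
            return float(np.sum(frame ** 2))
        return max(energies)
    return float(np.mean(frame ** 2))
```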
US5122875A
CLAIM 1
. Apparatus for encoding a television signal comprising : a source of video signals ;
first means , coupled to said source , for providing a compressed version of said video signals , said compressed version including a first sequence of codewords of varying types defining said compressed video signal and a second sequence of codewords associated with said first sequence and indicating said types ;
second means , coupled to said first means , and responsive to said second sequence of codewords for variably parsing said first sequence of codewords into a high priority codeword sequence and a low priority codeword sequence according to the type of respective codewords of said first sequence ;
a source of audio signal occurring as a sequence of codewords ;
third means , coupled to said second means and said source of audio signal , for forming mutually exclusive transport blocks of said high priority codeword sequence , said low priority codeword sequence and said sequence of audio codewords , each transport block including a predetermined bit capacity occupied by codewords of one of high priority , low priority or audio data (average energy) , transport block header information for identifying said data , and error check bits generated over said data and said transport block header information , said third means providing a first transport block sequence including transport blocks of said high priority codewords , a second transport block sequence including transport blocks of said low priority codewords , and wherein transport blocks of said audio codewords are interleaved with transport blocks of at least one of said first and second transport block sequences ;
and means including means for modulating said first and second transport block sequences on separate carriers , and combining said separately modulated carriers .

US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frames (signal classification parameter, speech signal) of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (respective frame) , an energy information parameter (image area) and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (current frame, discrete cosine) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
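A small sketch of the energy-adjustment relation recited above, reconstructed from the surrounding definitions as E_q = E_1 · E_LP0 / E_LP1. The 64-sample impulse-response length and the use of scipy's lfilter for the LP synthesis filter 1/A(z) are assumptions for illustration only, not details from the patent.

```python
import numpy as np
from scipy.signal import lfilter


def lp_impulse_response_energy(a: np.ndarray, length: int = 64) -> float:
    """Energy of the impulse response of the LP synthesis filter 1/A(z),
    where a = [1, a1, ..., ap] are the LP coefficients."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))


def adjusted_excitation_energy(e1: float, a_last_good: np.ndarray,
                               a_first_good: np.ndarray) -> float:
    """E_q = E_1 * E_LP0 / E_LP1, with E_LP0 and E_LP1 the impulse-response
    energies of the LP filters of the last and first good frames."""
    return e1 * lp_impulse_response_energy(a_last_good) / lp_impulse_response_energy(a_first_good)
```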
US5122875A
CLAIM 7
. The apparatus set forth in claim 6 wherein said means for selectively providing frames of intra-frame compressed video data interspersed with frames of motion-compensated-predictive compressed data includes ;
discrete cosine (current frame, communication link, decoder determines concealment) transform means , coupled to said source of video signals for providing transform coefficients representing blocks of pixels ;
and quantizing means for adaptively limiting the dynamic range of said transform coefficients .

US5122875A
CLAIM 9
. In a receiver for receiving a television signal of the type including compressed video data variably parsed on an image area (energy information parameter) by image area basis into high and low priority channels , the data in said high and low priority channels occurring in transport blocks of predetermined data capacity , said transport blocks including transport header information having control data related to said variable parsing , signal data , and error check data related to the transport header information and signal data contained in respective blocks , the signal data in each transport block corresponding to an exclusive type of data (e . g . , to high priority video data , or low priority video data) ;
apparatus comprising : first means for receiving said television signal and providing first and second data streams corresponding to transport blocks from said high and low priority channels respectively ;
second means , coupled to said first means , for providing first and second sequences of codewords corresponding to high priority video data and low priority video data respectively with said transport block header information excised therefrom , and providing a further sequence of codewords corresponding to said transport block header information ;
third means , coupled to said second means , and responsive to said transport block header information , including said control data , for combining said first and second sequences of codewords into a further sequence of codewords ;
and fourth means , coupled to said third means , for decompressing said further sequence of codewords representing compressed video data to produce a noncompressed video signal .

US5122875A
CLAIM 18
. The apparatus set forth in claim 15 wherein respective frames (signal classification parameter, speech signal) of said compressed video signal are compressed according to intraframe or interframe coding methods , and said first means parses said sequence of codewords as a function of the amount of video signal data representing respective predetermined image areas , and as a function of whether the current frame (current frame, communication link, decoder determines concealment) is intraframe or interframe encoded .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US4707857A

Filed: 1984-08-27     Issued: 1987-11-17

Voice command recognition system having compact significant feature data

(Original Assignee) John Marley; Kurt Marley     

John Marley, Kurt Marley
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US4707857A
CLAIM 11
. A system for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said system (maximum amplitude) comprising : (a) means for producing an analog signal representative of said utterance ;
(b) means responsive to said analog signal for producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said utterance ;
(c) means responsive to said binary signal for producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) means for computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) means for sampling said running averages at predetermined intervals to produce sampled data ;
(f) means responsive to said sampled data for identifying and producing event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) means for comparing event data for each significant event of successive groups of consecutive events to previously stored reference event data corresponding to various reference events , and identifying any resulting matching , said comparing of event data of each successive group after the initial comparing beginning immediately after any prior matching , that comparing being performed with stored reference event data corresponding to a next reference event ;
and (h) means for recognizing said utterance if there are fewer than a predetermined number of mismatches between event data of significant events of said utterance and said stored reference event data , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
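A compact sketch of the two transition cases described above, in which the gain used at the beginning of the first non-erased frame is simply made equal to the gain used at its end. The class labels and boolean flags are assumed names for illustration, not identifiers from the patent.

```python
def gain_at_frame_beginning(g_begin: float, g_end: float,
                            last_good_class: str, first_good_class: str,
                            last_good_is_comfort_noise: bool,
                            first_good_is_active_speech: bool) -> float:
    """Return the scaling gain for the start of the first non-erased frame
    received after an erasure, per the two special cases in the claim."""
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    inactive_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return g_end if (voiced_to_unvoiced or inactive_to_active) else g_begin
```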
US4707857A
CLAIM 1
. A method of speech recognition for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said method comprising the steps of : (a) producing an analog signal representative of said speech ;
(b) producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said speech ;
(c) producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) sampling said running averages at predetermined intervals ;
(f) operating on the sampled data to identify and produce event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) comparing event data for each significant event of a first group of consecutive events of said utterance to first previously stored reference event data , and identifying any resulting matching ;
(h) comparing event data for each significant event of a second group of consecutive events of said utterance beginning immediately after any such matching to second previously stored reference event data and identifying any matching ;
(i) repeating steps (g) and (h) for successive groups of said significant events ;
and (j) recognizing said utterance if there are fewer than a predetermined number (last non) of mismatches , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (analog signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US4707857A
CLAIM 1
. A method of speech recognition for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said method comprising the steps of : (a) producing an analog signal (current frame) representative of said speech ;
(b) producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said speech ;
(c) producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) sampling said running averages at predetermined intervals ;
(f) operating on the sampled data to identify and produce event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) comparing event data for each significant event of a first group of consecutive events of said utterance to first previously stored reference event data , and identifying any resulting matching ;
(h) comparing event data for each significant event of a second group of consecutive events of said utterance beginning immediately after any such matching to second previously stored reference event data and identifying any matching ;
(i) repeating steps (g) and (h) for successive groups of said significant events ;
and (j) recognizing said utterance if there are fewer than a predetermined number (last non) of mismatches , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US4707857A
CLAIM 11
. A system for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said system (maximum amplitude) comprising : (a) means for producing an analog signal representative of said utterance ;
(b) means responsive to said analog signal for producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said utterance ;
(c) means responsive to said binary signal for producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) means for computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) means for sampling said running averages at predetermined intervals to produce sampled data ;
(f) means responsive to said sampled data for identifying and producing event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) means for comparing event data for each significant event of successive groups of consecutive events to previously stored reference event data corresponding to various reference events , and identifying any resulting matching , said comparing of event data of each successive group after the initial comparing beginning immediately after any prior matching , that comparing being performed with stored reference event data corresponding to a next reference event ;
and (h) means for recognizing said utterance if there are fewer than a predetermined number of mismatches between event data of significant events of said utterance and said stored reference event data , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (analog signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US4707857A
CLAIM 1
. A method of speech recognition for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said method comprising the steps of : (a) producing an analog signal (current frame) representative of said speech ;
(b) producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said speech ;
(c) producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) sampling said running averages at predetermined intervals ;
(f) operating on the sampled data to identify and produce event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) comparing event data for each significant event of a first group of consecutive events of said utterance to first previously stored reference event data , and identifying any resulting matching ;
(h) comparing event data for each significant event of a second group of consecutive events of said utterance beginning immediately after any such matching to second previously stored reference event data and identifying any matching ;
(i) repeating steps (g) and (h) for successive groups of said significant events ;
and (j) recognizing said utterance if there are fewer than a predetermined number (last non) of mismatches , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US4707857A
CLAIM 11
. A system for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said system (maximum amplitude) comprising : (a) means for producing an analog signal representative of said utterance ;
(b) means responsive to said analog signal for producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said utterance ;
(c) means responsive to said binary signal for producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) means for computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) means for sampling said running averages at predetermined intervals to produce sampled data ;
(f) means responsive to said sampled data for identifying and producing event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) means for comparing event data for each significant event of successive groups of consecutive events to previously stored reference event data corresponding to various reference events , and identifying any resulting matching , said comparing of event data of each successive group after the initial comparing beginning immediately after any prior matching , that comparing being performed with stored reference event data corresponding to a next reference event ;
and (h) means for recognizing said utterance if there are fewer than a predetermined number of mismatches between event data of significant events of said utterance and said stored reference event data , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US4707857A
CLAIM 1
. A method of speech recognition for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said method comprising the steps of : (a) producing an analog signal representative of said speech ;
(b) producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said speech ;
(c) producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) sampling said running averages at predetermined intervals ;
(f) operating on the sampled data to identify and produce event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) comparing event data for each significant event of a first group of consecutive events of said utterance to first previously stored reference event data , and identifying any resulting matching ;
(h) comparing event data for each significant event of a second group of consecutive events of said utterance beginning immediately after any such matching to second previously stored reference event data and identifying any matching ;
(i) repeating steps (g) and (h) for successive groups of said significant events ;
and (j) recognizing said utterance if there are fewer than a predetermined number (last non) of mismatches , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (analog signal) , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US4707857A
CLAIM 1
. A method of speech recognition for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said method comprising the steps of : (a) producing an analog signal (current frame) representative of said speech ;
(b) producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said speech ;
(c) producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) sampling said running averages at predetermined intervals ;
(f) operating on the sampled data to identify and produce event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) comparing event data for each significant event of a first group of consecutive events of said utterance to first previously stored reference event data , and identifying any resulting matching ;
(h) comparing event data for each significant event of a second group of consecutive events of said utterance beginning immediately after any such matching to second previously stored reference event data and identifying any matching ;
(i) repeating steps (g) and (h) for successive groups of said significant events ;
and (j) recognizing said utterance if there are fewer than a predetermined number (last non) of mismatches , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US4707857A
CLAIM 11
. A system for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said system (maximum amplitude) comprising : (a) means for producing an analog signal representative of said utterance ;
(b) means responsive to said analog signal for producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said utterance ;
(c) means responsive to said binary signal for producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) means for computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) means for sampling said running averages at predetermined intervals to produce sampled data ;
(f) means responsive to said sampled data for identifying and producing event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) means for comparing event data for each significant event of successive groups of consecutive events to previously stored reference event data corresponding to various reference events , and identifying any resulting matching , said comparing of event data of each successive group after the initial comparing beginning immediately after any prior matching , that comparing being performed with stored reference event data corresponding to a next reference event ;
and (h) means for recognizing said utterance if there are fewer than a predetermined number of mismatches between event data of significant events of said utterance and said stored reference event data , which predetermined number is determined in accordance with a predetermined criteria .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (analog signal) , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US4707857A
CLAIM 1
. A method of speech recognition for recognizing an utterance having positive pressure wave portions and negative pressure wave portions , said method comprising the steps of : (a) producing an analog signal (current frame) representative of said speech ;
(b) producing a binary signal having a "1" level during positive pressure wave portions of said speech and a "0" level during negative pressure wave portions of said speech ;
(c) producing a plurality of first numbers that represent the durations of said "1" levels of said binary signal , and producing a plurality of second numbers that represent the durations of said "0" levels of said binary signal ;
(d) computing first and second running averages of said first numbers and said second numbers , respectively ;
(e) sampling said running averages at predetermined intervals ;
(f) operating on the sampled data to identify and produce event data for significant events of said utterance , including rapidly rising and rapidly falling values of said first running average , silence values of said second running average , and the durations of said significant events ;
(g) comparing event data for each significant event of a first group of consecutive events of said utterance to first previously stored reference event data , and identifying any resulting matching ;
(h) comparing event data for each significant event of a second group of consecutive events of said utterance beginning immediately after any such matching to second previously stored reference event data and identifying any matching ;
(i) repeating steps (g) and (h) for successive groups of said significant events ;
and (j) recognizing said utterance if there are fewer than a predetermined number (last non) of mismatches , which predetermined number is determined in accordance with a predetermined criteria .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2002100994A

Filed: 2001-07-16     Issued: 2002-04-05

Scalable encoding method for a media stream, scalable encoder and multimedia terminal

(Original Assignee) Nokia Mobile Phones Ltd

Pasi Ojala, Teemu Parkkinen
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (at least one) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
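A simplified sketch of the artificial periodic excitation described above: the impulse response of a low-pass filter is centered on the quantized position of the first glottal pulse and then repeated every average pitch period up to the end of the reconstructed region. The filter taps, lengths, and helper names below are placeholders, not the codec's actual low-pass filter.

```python
import numpy as np


def build_periodic_excitation(length: int, first_pulse_pos: int, avg_pitch: int,
                              lp_response: np.ndarray) -> np.ndarray:
    """Low-pass filtered periodic pulse train: center the first impulse response
    on first_pulse_pos, then place further responses every avg_pitch samples."""
    excitation = np.zeros(length)
    half = len(lp_response) // 2
    pos = first_pulse_pos
    while pos < length:
        start = max(0, pos - half)
        stop = min(length, pos - half + len(lp_response))
        excitation[start:stop] += lp_response[start - (pos - half):stop - (pos - half)]
        pos += avg_pitch
    return excitation


# illustrative usage with a placeholder 9-tap low-pass impulse response
taps = np.hanning(9)
exc = build_periodic_excitation(length=256, first_pulse_pos=12, avg_pitch=80, lp_response=taps)
```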
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder according to claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (at least one) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
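As an illustration of encoding a shape, sign and amplitude for the first glottal pulse, the sketch below matches the normalized pulse against a tiny hypothetical shape codebook; the codebook, pulse length and return format are invented for the example and are not taken from the patent.

```python
import numpy as np

# hypothetical 3-sample shape codebook; the codec's real codebook is not given here
PULSE_SHAPES = [np.array([0.25, 1.0, 0.25]), np.array([0.5, 1.0, 0.5])]


def encode_first_glottal_pulse(pulse: np.ndarray) -> tuple:
    """Return (shape index, sign, amplitude) for a 3-sample glottal pulse segment."""
    peak = int(np.argmax(np.abs(pulse)))
    amplitude = float(np.abs(pulse[peak]))
    sign = 1 if pulse[peak] >= 0 else -1
    normalized = pulse / (sign * amplitude + 1e-12)
    errors = [float(np.sum((normalized - shape) ** 2)) for shape in PULSE_SHAPES]
    return int(np.argmin(errors)), sign, amplitude
```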
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder according to claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (at least one) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (comprises) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder according to claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

JP2002100994A
CLAIM 3
[Claim 3] The scalable encoder according to claim 2, characterized in that the control means comprises (maximum amplitude) means (602, 802) configured to determine a target bit rate for at least the data stream generated by one of the first encoding means and the second encoding means, and to adjust the bit rate of said data stream.

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (at least one) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder according to claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (at least one) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
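A minimal sketch of the energy control described above: the synthesized signal is scaled so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, and the gain is then interpolated toward the energy conveyed by the received energy information parameter, with the increase capped. The 32-sample measurement window and the 2.0 gain cap are assumptions for illustration.

```python
import numpy as np

MAX_GAIN = 2.0  # assumed cap used to limit any increase in energy


def scale_recovered_frame(synth: np.ndarray, e_end_concealed: float, e_target: float) -> np.ndarray:
    """Scale the decoder's synthesized signal in the first non-erased frame."""
    e_begin = float(np.mean(synth[:32] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_begin), MAX_GAIN)  # gain at the frame beginning
    g1 = min(np.sqrt(e_target / e_begin), MAX_GAIN)         # gain converged to at the frame end
    gains = np.linspace(g0, g1, num=len(synth))             # sample-by-sample interpolation
    return synth * gains
```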
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder according to claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (102) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (at least one) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder according to claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (102) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)) that is a core data stream related to the media signal (101) and has a first bit rate; second encoding means (230) for generating a second data stream (103) that comprises a set of enhancement data streams related to the media signal and has a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized in that it further comprises control means (420, 421, 422) configured to receive control information (401), to determine, in accordance with the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (at least one) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder as claimed in claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of the current frame (備えるマルチ, ワーク) , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2002100994A
CLAIM 9
[Claim 9] The scalable encoder as claimed in claim 1, characterized in that at least one of the first encoding means and the second encoding means is multi-rate (current frame, replacement frame) encoding means comprising a set of available encoding algorithms.

JP2002100994A
CLAIM 31
[Claim 31] The multimedia terminal (1300) as claimed in any one of claims 24 to 30, characterized in that it is a mobile station of a mobile communication network (current frame, replacement frame).
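
For illustration only, the relation recited in claim 9 above can be read as rescaling the decoder's excitation energy by the ratio of the LP synthesis-filter impulse-response energies, so that the synthesized signal does not jump in level when the new LP filter has a higher gain than the one used during concealment. The Python sketch below is a minimal reading of that step; the function names, the impulse-response length and the A(z) sign convention are assumptions, not the patent's implementation.

```python
import numpy as np

def impulse_response_energy(lp_coeffs, length=64):
    """E_LP = sum(h[n]^2) for the all-pole synthesis filter 1/A(z),
    assuming A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (sign convention assumed)."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjust_excitation_energy(excitation, e1, lp_last_good, lp_first_good):
    """Scale the first good frame's excitation so its energy becomes E_q = E_1 * E_LP0 / E_LP1."""
    e_lp0 = impulse_response_energy(lp_last_good)    # last non-erased frame before the erasure
    e_lp1 = impulse_response_energy(lp_first_good)   # first non-erased frame after the erasure
    e_q = e1 * e_lp0 / e_lp1
    e_now = float(np.sum(np.asarray(excitation) ** 2))
    return np.asarray(excitation) * np.sqrt(e_q / max(e_now, 1e-12))
```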

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 3
[Claim 3] The scalable encoder as claimed in claim 2, characterized in that the control means has (maximum amplitude) means (602, 802) configured to determine a target bit rate for at least the data stream generated by one of the first encoding means and the second encoding means, and to adjust the bit rate of that data stream.
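
For illustration only, claims 10 and 11 above describe the phase-information parameter as the position of the first glottal pulse, located as the maximum-amplitude sample within a pitch period and then quantized (claim 10 additionally encodes the pulse's shape, sign and amplitude). A minimal sketch of the search-and-quantize step follows; the search signal, the 6-bit uniform quantizer and the function name are assumptions, not taken from the patent.

```python
import numpy as np

def first_glottal_pulse_position(excitation, pitch_period, bits=6):
    """Find the sample of maximum amplitude within one pitch period and quantize its position."""
    segment = np.asarray(excitation[:int(pitch_period)], dtype=float)
    pos = int(np.argmax(np.abs(segment)))        # sample of maximum amplitude
    sign = 1 if segment[pos] >= 0 else -1        # pulse sign (claim 10 also encodes this)
    step = len(segment) / float(2 ** bits)       # uniform quantization step (assumed)
    index = min(int(pos / step), 2 ** bits - 1)
    return int(round(index * step)), index, sign
```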

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (102) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (備えるマルチ, ワーク) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of the current frame (備えるマルチ, ワーク) , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder as claimed in claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

JP2002100994A
CLAIM 9
[Claim 9] The scalable encoder as claimed in claim 1, characterized in that at least one of the first encoding means and the second encoding means is multi-rate (current frame, replacement frame) encoding means comprising a set of available encoding algorithms.

JP2002100994A
CLAIM 31
[Claim 31] The multimedia terminal (1300) as claimed in any one of claims 24 to 30, characterized in that it is a mobile station of a mobile communication network (current frame, replacement frame).

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment (の少なくとも1) and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder as claimed in claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.
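
For illustration only, when an onset frame is lost, claims 1 and 13 rebuild the periodic excitation artificially as a low-pass filtered train of pulses: the first low-pass impulse response is centred on the quantized glottal-pulse position and the remaining responses are spaced by the average pitch value. The sketch below shows one such construction; the FIR coefficients, their length and the helper name are assumptions, not the filter used by the patent.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps):
    """Artificial periodic excitation for a lost onset frame: low-pass filtered pulse train."""
    pulses = np.zeros(frame_len)
    step = max(1, int(round(avg_pitch)))   # pulse spacing = average pitch value
    pos = int(first_pulse_pos)             # quantized position of the first glottal pulse
    while pos < frame_len:
        pulses[pos] = 1.0
        pos += step
    # Convolving with a symmetric, odd-length FIR centres one impulse response on each pulse.
    return np.convolve(pulses, lp_taps, mode="same")

# Usage with purely illustrative low-pass coefficients.
taps = np.array([-0.0125, 0.109, 0.7813, 0.109, -0.0125])
excitation = build_periodic_excitation(frame_len=256, first_pulse_pos=17, avg_pitch=80.0, lp_taps=taps)
```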

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 3
[Claim 3] The scalable encoder as claimed in claim 2, characterized in that the control means has (maximum amplitude) means (602, 802) configured to determine a target bit rate for at least the data stream generated by one of the first encoding means and the second encoding means, and to adjust the bit rate of that data stream.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.
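
For illustration only, claim 16 above (like claims 4 and 24) computes the energy information parameter in two branches: from a maximum of the signal energy for frames classified as voiced or onset, and from the average energy per sample for the other classes. The sketch below is one possible reading; taking the maximum squared sample over the last pitch period, and the function name, are assumptions.

```python
import numpy as np

def energy_information_parameter(frame, frame_class, pitch_period):
    """Two-branch energy parameter: signal-energy maximum for voiced/onset, mean energy otherwise."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        tail = x[-int(pitch_period):]      # assumed window: last pitch period of the frame
        return float(np.max(tail ** 2))    # maximum of the signal energy
    return float(np.mean(x ** 2))          # average energy per sample for other frames
```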

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment (の少なくとも1) and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder as claimed in claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.
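
For illustration only, the energy control recited in claim 17 above first scales the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, and then converges it toward the transmitted energy information parameter by the end of that frame while limiting any increase. A minimal sketch, assuming per-sample linear interpolation of the scaling gain, quarter-frame energy windows and an arbitrary gain cap:

```python
import numpy as np

def recover_frame_energy(synth, e_end_concealed, e_target, max_gain=2.0):
    """Scale a recovered frame from the concealed-frame energy toward the transmitted target energy."""
    x = np.asarray(synth, dtype=float)
    n4 = max(1, len(x) // 4)
    e_begin = float(np.mean(x[:n4] ** 2)) + 1e-12   # energy at the beginning of the frame
    e_end = float(np.mean(x[-n4:] ** 2)) + 1e-12    # energy toward the end of the frame
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match the last concealed frame
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to the energy parameter, capped
    return x * np.linspace(g0, g1, num=len(x))              # gain interpolated sample by sample
```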

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (102) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (の少なくとも1) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder as claimed in claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (102) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of a current frame (備えるマルチ, ワーク) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2002100994A
CLAIM 9
[Claim 9] The scalable encoder as claimed in claim 1, characterized in that at least one of the first encoding means and the second encoding means is multi-rate (current frame, replacement frame) encoding means comprising a set of available encoding algorithms.

JP2002100994A
CLAIM 31
[Claim 31] The multimedia terminal (1300) as claimed in any one of claims 24 to 30, characterized in that it is a mobile station of a mobile communication network (current frame, replacement frame).

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 3
[Claim 3] The scalable encoder as claimed in claim 2, characterized in that the control means has (maximum amplitude) means (602, 802) configured to determine a target bit rate for at least the data stream generated by one of the first encoding means and the second encoding means, and to adjust the bit rate of that data stream.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (102) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (102) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (備えるマルチ, ワーク) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (の少なくとも1) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of a current frame (備えるマルチ, ワーク) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JP2002100994A
CLAIM 1
[Claim 1] A scalable encoder (100) for encoding a media signal, comprising: first encoding means (210) for generating a first data stream (102 (sound signal)), the first data stream being a core data stream related to the media signal (101) and having a first bit rate; second encoding means (230) for generating a second data stream (103) comprising a set of enhancement data streams related to the media signal and having a second bit rate; and a multiplexer (110) for combining at least the first data stream and the second data stream into a third data stream (104); the scalable encoder being characterized by further comprising control means (420, 421, 422) configured to receive control information (401), to determine, according to the control information, a target combination of the first data stream and the second data stream in the third data stream, and to adjust the combination of the first data stream and the second data stream in the third data stream by influencing the first bit rate and the second bit rate.

JP2002100994A
CLAIM 2
[Claim 2] The scalable encoder as claimed in claim 1, characterized in that at least one (conducting frame erasure concealment) of the first encoding means and the second encoding means is variable-rate encoding means.

JP2002100994A
CLAIM 9
[Claim 9] The scalable encoder as claimed in claim 1, characterized in that at least one of the first encoding means and the second encoding means is multi-rate (current frame, replacement frame) encoding means comprising a set of available encoding algorithms.

JP2002100994A
CLAIM 31
[Claim 31] The multimedia terminal (1300) as claimed in any one of claims 24 to 30, characterized in that it is a mobile station of a mobile communication network (current frame, replacement frame).




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
CN1344067A

Filed: 2001-07-09     Issued: 2002-04-10

Transmission system using different coding principles

(Original Assignee) Royal Philips Electronics

F·伍帕曼
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (至少其中一个) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
CN1344067A
CLAIM 2
. A decoding system for obtaining a recombined signal from a first and a second encoded signal, the system comprising a first and a second decoder, at least one (first impulse) of which is a frequency-domain decoder, the decoding system comprising signal combining means for combining the decoded signal obtained from the frequency-domain decoder with the decoded signal obtained from another decoder into the recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
CN1344067A
CLAIM 1
. A receiver for obtaining a recombined signal from a first (LP filter excitation signal) and a second decoded signal, the receiver comprising a first and a second decoder, wherein the receiver comprises signal combining means for combining a decoded signal obtained in a frequency-domain decoder with a decoded signal obtained by another decoder into a recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
CN1344067A
CLAIM 1
. A receiver for obtaining a recombined signal from a first (LP filter excitation signal) and a second decoded signal, the receiver comprising a first and a second decoder, wherein the receiver comprises signal combining means for combining a decoded signal obtained in a frequency-domain decoder with a decoded signal obtained by another decoder into a recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
CN1344067A
CLAIM 1
. A receiver for obtaining a recombined signal from a first (LP filter excitation signal) and a second decoded signal, the receiver comprising a first and a second decoder, wherein the receiver comprises signal combining means for combining a decoded signal obtained in a frequency-domain decoder with a decoded signal obtained by another decoder into a recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (至少其中一个) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
CN1344067A
CLAIM 2
. A decoding system for obtaining a recombined signal from a first and a second encoded signal, the system comprising a first and a second decoder, at least one (first impulse) of which is a frequency-domain decoder, the decoding system comprising signal combining means for combining the decoded signal obtained from the frequency-domain decoder with the decoded signal obtained from another decoder into the recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
CN1344067A
CLAIM 1
. A receiver for obtaining a recombined signal from a first (LP filter excitation signal) and a second decoded signal, the receiver comprising a first and a second decoder, wherein the receiver comprises signal combining means for combining a decoded signal obtained in a frequency-domain decoder with a decoded signal obtained by another decoder into a recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
CN1344067A
CLAIM 1
. A receiver for obtaining a recombined signal from a first (LP filter excitation signal) and a second decoded signal, the receiver comprising a first and a second decoder, wherein the receiver comprises signal combining means for combining a decoded signal obtained in a frequency-domain decoder with a decoded signal obtained by another decoder into a recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
CN1344067A
CLAIM 1
. A receiver for obtaining a recombined signal from a first (LP filter excitation signal) and a second decoded signal, the receiver comprising a first and a second decoder, wherein the receiver comprises signal combining means for combining a decoded signal obtained in a frequency-domain decoder with a decoded signal obtained by another decoder into a recombined signal, characterized in that the first decoder is a frequency-domain decoder and the second decoder is a time-domain decoder.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US20010023396A1

Filed: 2001-02-05     Issued: 2001-09-20

Method and apparatus for hybrid coding of speech at 4kbps

(Original Assignee) Allen Gersho; Eyal Shlomot; Vladimir Cuperman; Chunyan Li     

Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (successive frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US20010023396A1
CLAIM 5
. A method as recited in claim 4 , further comprising the step of time aligning the reproduced speech across the boundary between two successive frames (onset frame) of speech where one frame of speech is waveform coded and the other frame of speech is harmonically coded .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US20010023396A1
CLAIM 13
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonically coding frames associated with at least one of said classes ;
and (c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US20010023396A1
CLAIM 13
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonically coding frames associated with at least one of said classes ;
and (c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
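A minimal sketch of the position determination recited above, assuming the LP residual and an integer open-loop pitch value are already available: the first glottal pulse is taken as the sample of maximum absolute amplitude inside the first pitch period, and its position is quantized on a uniform grid so that it fits a small number of bits (the bit budget shown is an assumption).

    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period, position_bits=6):
        """Sample of maximum amplitude in the first pitch period, with its position quantized."""
        window = np.abs(np.asarray(residual[:pitch_period], dtype=float))
        pos = int(np.argmax(window))
        step = max(1, int(np.ceil(pitch_period / float(1 << position_bits))))
        index = pos // step                  # value that would be transmitted
        return index, index * step           # index and the decoder-side reconstruction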
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
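Reading the relation E_q = E_1 · (E_LP0 / E_LP1) literally, a minimal sketch follows; the impulse-response truncation and the scaling of the excitation to the target energy are assumptions, not the patentee's reference implementation.

    import numpy as np
    from scipy.signal import lfilter

    def lp_impulse_energy(a_coeffs, length=64):
        """Energy of the truncated impulse response of 1/A(z), a_coeffs = [1, a1, ..., ap]."""
        impulse = np.zeros(length)
        impulse[0] = 1.0
        h = lfilter([1.0], a_coeffs, impulse)
        return float(np.dot(h, h))

    def scale_excitation_after_erasure(excitation, e1, a_last_good, a_first_good):
        """Scale the first good frame's excitation so its energy equals E_q = E_1 * (E_LP0 / E_LP1)."""
        e_lp0 = lp_impulse_energy(a_last_good)    # LP filter of the last good frame before erasure
        e_lp1 = lp_impulse_energy(a_first_good)   # LP filter of the first good frame after erasure
        e_q = e1 * e_lp0 / e_lp1
        current = float(np.dot(excitation, excitation)) + 1e-12
        return np.asarray(excitation, dtype=float) * np.sqrt(e_q / current)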
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US20010023396A1
CLAIM 13
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonically coding frames associated with at least one of said classes ;
and (c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (successive frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
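Illustrative sketch of the artificially constructed periodic excitation described above: unit pulses are placed at the quantized first-glottal-pulse position and then every average pitch period, and the train is low-pass filtered, which amounts to centering the filter's impulse response on each pulse. The FIR design and the group-delay compensation are illustrative choices, not the codec's actual filter.

    import numpy as np
    from scipy.signal import firwin, lfilter

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, cutoff=0.3, taps=31):
        assert avg_pitch > 0
        pulses = np.zeros(frame_len)
        pos = int(first_pulse_pos)
        while pos < frame_len:                   # one pulse per (average) pitch period
            pulses[pos] = 1.0
            pos += int(avg_pitch)
        h = firwin(taps, cutoff)                 # linear-phase low-pass prototype
        filtered = lfilter(h, [1.0], pulses)
        delay = (taps - 1) // 2                  # undo the FIR group delay so each impulse
        return np.concatenate([filtered[delay:], np.zeros(delay)])  # response stays centred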
US20010023396A1
CLAIM 5
. A method as recited in claim 4 , further comprising the step of time aligning the reproduced speech across the boundary between two successive frames (onset frame) of speech where one frame of speech is waveform coded and the other frame of speech is harmonically coded .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
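A minimal sketch of the energy-information computation described above: the maximum of the signal energy for frames classified as voiced or onset, and the average energy per sample otherwise. The dB mapping is an assumption for illustration.

    import numpy as np

    def energy_information(frame, frame_class):
        x = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset"):
            energy = float(np.max(x ** 2))           # maximum of the signal energy
        else:
            energy = float(np.dot(x, x)) / len(x)    # average energy per sample
        return 10.0 * np.log10(energy + 1e-12)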
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
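A minimal sketch of the two-sided energy control described above, under assumed window lengths and gain cap: one gain matches the start of the first good frame to the energy at the end of the concealed signal, a second gain targets the transmitted energy at the end of the frame while limiting any increase, and the gain is interpolated sample by sample in between.

    import numpy as np

    def control_recovered_energy(synth, e_end_concealed, e_target, max_gain=2.0, edge=32):
        x = np.asarray(synth, dtype=float)
        e_begin = np.dot(x[:edge], x[:edge]) / edge + 1e-12
        e_end = np.dot(x[-edge:], x[-edge:]) / edge + 1e-12
        g0 = np.sqrt(e_end_concealed / e_begin)        # continuity with the concealed frame
        g1 = min(np.sqrt(e_target / e_end), max_gain)  # converge, but limit the increase
        return x * np.linspace(g0, g1, len(x))         # per-sample gain interpolation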
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US20010023396A1
CLAIM 13
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonically coding frames associated with at least one of said classes ;
and (c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US20010023396A1
CLAIM 13
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonically coding frames associated with at least one of said classes ;
and (c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US20010023396A1
CLAIM 2
. A method as recited in claim 1 , further comprising the steps of : (a) time aligning a harmonically coded frame in a decoder when the preceding frame (signal classification parameter) has been waveform encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonic coded frame ;
and (b) time aligning the frame in an encoder to be waveform encoded when the subsequent frame is to be harmonically encoded for pairs of adjacent frames comprising a waveform encoded frame followed by a harmonically coded frame .

US20010023396A1
CLAIM 13
. A hybrid speech encoder (last frame, replacement frame) , comprising : (a) means for classifying frames of speech signals as voiced , unvoiced , or transitory ;
(b) means for harmonically coding frames associated with at least one of said classes ;
and (c) means for coding frames classified as transitory using a coding technique selected from the group consisting of waveform coding , analysis-by-synthesis coding , codebook excited linear prediction analysis-by-synthesis coding , and multipulse analysis-by-synthesis coding .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2002118517A

Filed: 2000-12-04     Issued: 2002-04-19

Orthogonal transform apparatus and method, inverse orthogonal transform apparatus and method, transform coding apparatus and method, and decoding apparatus and method

(Original Assignee) Sony Corp; ソニー株式会社     

Kenichi Makino, Atsushi Matsumoto, Masayuki Nishiguchi, 淳 松本, 堅一 牧野, 正之 西口
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (フレーム間) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (上記逆) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2002118517A
CLAIM 3
The orthogonal transform apparatus according to claim 2 , wherein the window function is chosen appropriately so that the above sample position α is aligned between adjacent frames [フレーム間] (onset frame) .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

JP2002118517A
CLAIM 17
A decoding apparatus for decoding quantized data obtained by quantizing orthogonal transform coefficients produced by orthogonally transforming an input time series of M samples with overlap , using a block length determined according to the characteristics of the input signal and a sample position α , forming the boundary at which aliasing occurs upon inverse orthogonal transform , determined arbitrarily within the range 0≦α<M , the apparatus comprising : inverse quantization means for inverse-quantizing the quantized data ; and inverse orthogonal transform means for inverse-orthogonally transforming the orthogonal transform coefficients obtained by the above inverse [上記逆] (average pitch) quantization means , using the block length determined according to the characteristics of the input signal .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (変換係数) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2002118517A
CLAIM 6
An inverse orthogonal transform apparatus for inverse-orthogonally transforming orthogonal transform coefficients [直交変換係数] (last frame) obtained by orthogonally transforming time-series samples while overlapping them , wherein the orthogonal transform coefficients , transformed with a sample position α that forms the boundary at which aliasing occurs upon inverse orthogonal transform and that is determined arbitrarily within the range 0≦α<M , are inverse-orthogonally transformed .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
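A minimal sketch of the special cases recited in this claim: during a voiced-to-unvoiced transition, or when returning from comfort noise to active speech, the scaling gain at the start of the first good frame is simply set equal to the gain used at its end rather than being matched to the concealed signal. Class labels and the fallback convention are assumptions.

    def starting_gain(g_end, last_good_class, first_good_class,
                      last_good_was_comfort_noise, first_good_is_active_speech):
        """Return the gain to use at the start of the first good frame, or None to
        fall back to the ordinary energy-matching rule."""
        voiced_like = {"voiced transition", "voiced", "onset"}
        voiced_to_unvoiced = (last_good_class in voiced_like
                              and first_good_class == "unvoiced")
        dtx_to_speech = last_good_was_comfort_noise and first_good_is_active_speech
        if voiced_to_unvoiced or dtx_to_speech:
            return g_end        # g0 = g1: no artificial energy ramp across the transition
        return None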
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (変換係数) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2002118517A
CLAIM 6
An inverse orthogonal transform apparatus for inverse-orthogonally transforming orthogonal transform coefficients [直交変換係数] (last frame) obtained by orthogonally transforming time-series samples while overlapping them , wherein the orthogonal transform coefficients , transformed with a sample position α that forms the boundary at which aliasing occurs upon inverse orthogonal transform and that is determined arbitrarily within the range 0≦α<M , are inverse-orthogonally transformed .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (変換係数) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2002118517A
CLAIM 6
An inverse orthogonal transform apparatus for inverse-orthogonally transforming orthogonal transform coefficients [直交変換係数] (last frame) obtained by orthogonally transforming time-series samples while overlapping them , wherein the orthogonal transform coefficients , transformed with a sample position α that forms the boundary at which aliasing occurs upon inverse orthogonal transform and that is determined arbitrarily within the range 0≦α<M , are inverse-orthogonally transformed .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (フレーム間) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (上記逆) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2002118517A
CLAIM 3
The orthogonal transform apparatus according to claim 2 , wherein the window function is chosen appropriately so that the above sample position α is aligned between adjacent frames [フレーム間] (onset frame) .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

JP2002118517A
CLAIM 17
A decoding apparatus for decoding quantized data obtained by quantizing orthogonal transform coefficients produced by orthogonally transforming an input time series of M samples with overlap , using a block length determined according to the characteristics of the input signal and a sample position α , forming the boundary at which aliasing occurs upon inverse orthogonal transform , determined arbitrarily within the range 0≦α<M , the apparatus comprising : inverse quantization means for inverse-quantizing the quantized data ; and inverse orthogonal transform means for inverse-orthogonally transforming the orthogonal transform coefficients obtained by the above inverse [上記逆] (average pitch) quantization means , using the block length determined according to the characteristics of the input signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (変換係数) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2002118517A
CLAIM 6
An inverse orthogonal transform apparatus for inverse-orthogonally transforming orthogonal transform coefficients [直交変換係数] (last frame) obtained by orthogonally transforming time-series samples while overlapping them , wherein the orthogonal transform coefficients , transformed with a sample position α that forms the boundary at which aliasing occurs upon inverse orthogonal transform and that is determined arbitrarily within the range 0≦α<M , are inverse-orthogonally transformed .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (変換係数) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2002118517A
CLAIM 6
An inverse orthogonal transform apparatus for inverse-orthogonally transforming orthogonal transform coefficients [直交変換係数] (last frame) obtained by orthogonally transforming time-series samples while overlapping them , wherein the orthogonal transform coefficients , transformed with a sample position α that forms the boundary at which aliasing occurs upon inverse orthogonal transform and that is determined arbitrarily within the range 0≦α<M , are inverse-orthogonally transformed .

JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2002118517A
CLAIM 14
The transform coding apparatus according to claim 8 , wherein the input signal is a speech signal [音声信号] (sound signal, speech signal) and/or an acoustic signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (変換係数) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JP2002118517A
CLAIM 6
[Claim 6] An inverse orthogonal transform apparatus for inverse orthogonal transforming orthogonal transform coefficients (last frame) obtained by orthogonally transforming time-series samples while overlapping them , characterized in that a sample position α , which forms the boundary at which aliasing occurs during the inverse orthogonal transform , is arbitrarily determined within a range 0 ≤ α < M , and the orthogonal transform coefficients so transformed are inverse orthogonally transformed .

JP2002118517A
CLAIM 14
[Claim 14] A transform coding apparatus according to claim 8 , characterized in that the input signal is a speech signal (sound signal, speech signal) and/or an acoustic signal .
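For orientation only, the energy-adjustment relation recited in claim 25 above (and again in claims 9, 12 and 21 below) can be illustrated with a short numerical sketch. This is a minimal illustration, not the patent's reference implementation: the helper names (lp_impulse_response_energy, adjust_excitation_energy), the use of numpy, and the 64-sample impulse-response truncation are all assumptions.

```python
import numpy as np

def lp_impulse_response_energy(a, n_samples=64):
    """Energy of the impulse response of an all-pole LP synthesis filter
    1 / A(z), with A(z) = 1 + a[0] z^-1 + ...  (hypothetical helper)."""
    h = np.zeros(n_samples)
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0          # unit impulse input
        for k, ak in enumerate(a, start=1):   # all-pole recursion
            if n - k >= 0:
                acc -= ak * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjust_excitation_energy(e1, a_last_good, a_first_good):
    """Target excitation energy E_q = E_1 * (E_LP0 / E_LP1), as recited,
    where E_1 is the energy at the end of the current (concealed) frame."""
    e_lp0 = lp_impulse_response_energy(a_last_good)   # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # first good frame after the erasure
    return e1 * e_lp0 / e_lp1

# toy, assumed LP coefficient sets (stable filters) and an end-of-frame energy
eq = adjust_excitation_energy(e1=0.8, a_last_good=[-0.9], a_first_good=[-0.5])
print(f"scaled excitation energy E_q = {eq:.4f}")
```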




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP1096477A2

Filed: 2000-10-25     Issued: 2001-05-02

Apparatus for converting reproducing speed and method of converting reproducing speed

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Akira Inoue, Masayuki Nishiguchi
US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1096477A2
CLAIM 3
The apparatus according to claim 2 , which further comprises delay means for compensating delay in the low-pass filter means , wherein the control means supplies acoustic signals having a length of predetermined number (last non) of pitch cycle from a process-start position into the second accumulating means through the delay means .
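As a reading aid for claim 7 (and its device counterpart, claim 19), the gain-matching rule at class transitions can be expressed as a small decision helper. This is a sketch under assumed names (FrameClass, should_equalize_scaling_gain); the claim itself prescribes no particular data types or API.

```python
from enum import Enum, auto

class FrameClass(Enum):
    UNVOICED = auto()
    UNVOICED_TRANSITION = auto()
    VOICED_TRANSITION = auto()
    VOICED = auto()
    ONSET = auto()

def should_equalize_scaling_gain(last_good_class, first_good_class,
                                 last_good_is_comfort_noise=False,
                                 first_good_is_active_speech=False):
    """True when the gain used at the beginning of the first good frame after
    an erasure is simply set equal to the gain used at the end of that frame:
    voiced-to-unvoiced transitions, and comfort-noise to active-speech
    transitions (per claim 7)."""
    voiced_like = {FrameClass.VOICED_TRANSITION, FrameClass.VOICED, FrameClass.ONSET}
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class is FrameClass.UNVOICED)
    dtx_to_speech = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or dtx_to_speech

print(should_equalize_scaling_gain(FrameClass.VOICED, FrameClass.UNVOICED))  # True
```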

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1096477A2
CLAIM 3
The apparatus according to claim 2 , which further comprises delay means for compensating delay in the low-pass filter means , wherein the control means supplies acoustic signals having a length of predetermined number (last non) of pitch cycle from a process-start position into the second accumulating means through the delay means .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1096477A2
CLAIM 3
The apparatus according to claim 2 , which further comprises delay means for compensating delay in the low-pass filter means , wherein the control means supplies acoustic signals having a length of predetermined number (last non) of pitch cycle from a process-start position into the second accumulating means through the delay means .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1096477A2
CLAIM 3
The apparatus according to claim 2 , which further comprises delay means for compensating delay in the low-pass filter means , wherein the control means supplies acoustic signals having a length of predetermined number (last non) of pitch cycle from a process-start position into the second accumulating means through the delay means .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1096477A2
CLAIM 3
The apparatus according to claim 2 , which further comprises delay means for compensating delay in the low-pass filter means , wherein the control means supplies acoustic signals having a length of predetermined number (last non) of pitch cycle from a process-start position into the second accumulating means through the delay means .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1096477A2
CLAIM 3
The apparatus according to claim 2 , which further comprises delay means for compensating delay in the low-pass filter means , wherein the control means supplies acoustic signals having a length of predetermined number (last non) of pitch cycle from a process-start position into the second accumulating means through the delay means .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP1199812A1

Filed: 2000-10-20     Issued: 2002-04-24

Perceptually improved encoding of acoustic signals

(Original Assignee) Telefonaktiebolaget LM Ericsson AB     (Current Assignee) Telefonaktiebolaget LM Ericsson AB

Stefan Bruhn, Susanne Olvenstam
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
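To make the artificial onset reconstruction of claim 1 (and of device claim 13 below) easier to follow, here is a minimal numpy sketch that centres a low-pass filter impulse response on the quantized first-glottal-pulse position and repeats it at pitch intervals. The filter taps, frame length and pitch value are arbitrary placeholders, build_periodic_excitation is a hypothetical name, and for simplicity the loop stops at the frame boundary rather than at the last affected subframe.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, pitch, lp_taps):
    """Artificial periodic excitation part: a low-pass filtered train of
    pulses separated by the pitch period, with the first impulse response
    centred on the quantized first glottal pulse position."""
    exc = np.zeros(frame_len)
    half = len(lp_taps) // 2
    pos = first_pulse_pos
    while pos < frame_len:                       # repeat up to the end of the frame
        for i, tap in enumerate(lp_taps):
            idx = pos - half + i                 # centre the impulse response on pos
            if 0 <= idx < frame_len:
                exc[idx] += tap
        pos += pitch                             # next pulse one pitch period later
    return exc

# assumed values: 5-tap symmetric low-pass response, 64-sample frame, pitch of 40
lowpass = np.array([0.1, 0.25, 0.3, 0.25, 0.1])
excitation = build_periodic_excitation(frame_len=64, first_pulse_pos=12,
                                        pitch=40, lp_taps=lowpass)
print(np.nonzero(excitation)[0])
```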

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
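Claim 3 above (like claims 11 and 15) locates the first glottal pulse as the maximum-amplitude sample inside one pitch period and quantizes that position. The sketch below is one plausible reading with an assumed uniform quantizer step of 4 samples; the patent family ties the precision to the available bit budget, which is not reproduced here.

```python
import numpy as np

def find_and_quantize_first_glottal_pulse(residual, pitch, step=4):
    """Search the first pitch period of the LP residual for the sample of
    maximum absolute amplitude and quantize its position with a uniform
    step (assumed); returns (position, quantized_position)."""
    window = np.asarray(residual[:pitch])
    position = int(np.argmax(np.abs(window)))     # sample of maximum amplitude
    quantized = int(round(position / step)) * step
    return position, min(quantized, pitch - 1)

# toy residual with a clear pulse at sample 23 inside a 40-sample pitch period
residual = np.zeros(80)
residual[23] = 1.0
print(find_and_quantize_first_glottal_pulse(residual, pitch=40))  # (23, 24)
```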

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
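Claim 4 (and its device counterparts, claims 16 and 24) computes the energy information parameter differently per class: from a maximum of the signal energy for voiced or onset frames, and from the average energy per sample otherwise. The sketch below assumes a pitch-synchronous running maximum and a dB scale; the codec's exact windowing and quantization are not reproduced.

```python
import numpy as np

def energy_information_parameter(frame, frame_class, pitch=None):
    """Energy parameter: maximum of the signal energy for voiced/onset frames
    (here a sliding pitch-length window, assumed), average energy per sample
    for all other classes. Returned in dB."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        t = pitch if pitch else len(x)
        energies = [np.sum(x[i:i + t] ** 2) for i in range(0, len(x) - t + 1)]
        e = max(energies) if energies else np.sum(x ** 2)
    else:
        e = np.mean(x ** 2)                      # average energy per sample
    return 10.0 * np.log10(max(e, 1e-12))

frame = np.sin(2 * np.pi * np.arange(160) / 40.0)
print(round(energy_information_parameter(frame, "voiced", pitch=40), 2))
print(round(energy_information_parameter(frame, "unvoiced"), 2))
```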

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (overlapping region) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
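Claim 5's decoder-side energy control scales the synthesized signal so that its energy at the start of the first good frame matches the end of the last concealed frame, then converges toward the received energy value while limiting any increase; claim 6 additionally caps the gain when that first good frame is an onset. The sketch below interpolates a sample-wise gain between the two endpoint gains, which is one plausible realization rather than the codec's actual interpolation; all names, the factor-of-two increase limit and the 1.2 onset cap are assumptions.

```python
import numpy as np

def energy_control_gains(e_start_target, e_start_actual,
                         e_end_target, e_end_actual,
                         frame_len, is_onset=False,
                         max_increase=2.0, onset_cap=1.2):
    """Per-sample gains: g0 matches the concealed-frame end energy at the
    frame start, g1 converges to the received energy parameter at the frame
    end; the increase is limited and the onset case is capped (claim 6)."""
    g0 = np.sqrt(e_start_target / max(e_start_actual, 1e-12))
    g1 = np.sqrt(e_end_target / max(e_end_actual, 1e-12))
    g1 = min(g1, g0 * max_increase)              # limit the energy increase
    if is_onset:
        g0 = min(g0, onset_cap)
        g1 = min(g1, onset_cap)
    # linear interpolation of the gain across the frame (assumed)
    return g0 + (g1 - g0) * np.arange(frame_len) / max(frame_len - 1, 1)

gains = energy_control_gains(e_start_target=0.9, e_start_actual=0.4,
                             e_end_target=1.0, e_end_actual=0.5,
                             frame_len=8)
print(np.round(gains, 3))
```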

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (second buffer memory) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (overlapping region) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (second buffer memory) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (overlapping region) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (overlapping region) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP1199812A1
CLAIM 1
A method of encoding an acoustic source signal (x) to produce encoded information (P 1 , P C) for transmission over a transmission medium (306) , comprising : producing , in response to the acoustic source signal (x) , a basic coded signal (P 1) representing perceptually significant characteristics of the acoustic signal (x) , a target signal (r) representing a filtered version of the acoustic source signal (x) , and a primary coded signal (y) representing a reconstructed signal based on the basic coded signal (P 1) , producing , in response to at least one of the primary coded signal (y) and the target signal (r) a corresponding smoothed signal (y E ;
r E) constituting a perceptually improved representation of the primary coded signal (y) respective the target signal (r) , and producing a secondary coded signal (P c) on basis of a combination of either : the smoothed primary coded signal (y E) and the target signal (r) , the primary coded signal (y) and the smoothed target signal (r E) , or the smoothed primary coded signal (y E) and the smoothed target signal (r E) , characterised by the primary coded signal (y) comprising coefficients (K Y) of which each coefficient represents a (LP filter excitation signal) frequency component , the target signal (r) comprising coefficients of which each coefficient represents a frequency component , and the corresponding smoothed signals (y E ;
r E) being selectively modified versions of the primary coded signal (y) respective the target signal (r) wherein a variation is reduced in the coefficient values (K YE) representing frequency information above a threshold value (f T) .

EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non (overlapping region) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1199812A1
CLAIM 1
A method of encoding an acoustic source signal (x) to produce encoded information (P 1 , P C) for transmission over a transmission medium (306) , comprising : producing , in response to the acoustic source signal (x) , a basic coded signal (P 1) representing perceptually significant characteristics of the acoustic signal (x) , a target signal (r) representing a filtered version of the acoustic source signal (x) , and a primary coded signal (y) representing a reconstructed signal based on the basic coded signal (P 1) , producing , in response to at least one of the primary coded signal (y) and the target signal (r) a corresponding smoothed signal (y E ;
r E) constituting a perceptually improved representation of the primary coded signal (y) respective the target signal (r) , and producing a secondary coded signal (P c) on basis of a combination of either : the smoothed primary coded signal (y E) and the target signal (r) , the primary coded signal (y) and the smoothed target signal (r E) , or the smoothed primary coded signal (y E) and the smoothed target signal (r E) , characterised by the primary coded signal (y) comprising coefficients (K Y) of which each coefficient represents a (LP filter excitation signal) frequency component , the target signal (r) comprising coefficients of which each coefficient represents a frequency component , and the corresponding smoothed signals (y E ;
r E) being selectively modified versions of the primary coded signal (y) respective the target signal (r) wherein a variation is reduced in the coefficient values (K YE) representing frequency information above a threshold value (f T) .

EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (second buffer memory) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (overlapping region) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1199812A1
CLAIM 1
A method of encoding an acoustic source signal (x) to produce encoded information (P 1 , P C) for transmission over a transmission medium (306) , comprising : producing , in response to the acoustic source signal (x) , a basic coded signal (P 1) representing perceptually significant characteristics of the acoustic signal (x) , a target signal (r) representing a filtered version of the acoustic source signal (x) , and a primary coded signal (y) representing a reconstructed signal based on the basic coded signal (P 1) , producing , in response to at least one of the primary coded signal (y) and the target signal (r) a corresponding smoothed signal (y E ;
r E) constituting a perceptually improved representation of the primary coded signal (y) respective the target signal (r) , and producing a secondary coded signal (P c) on basis of a combination of either : the smoothed primary coded signal (y E) and the target signal (r) , the primary coded signal (y) and the smoothed target signal (r E) , or the smoothed primary coded signal (y E) and the smoothed target signal (r E) , characterised by the primary coded signal (y) comprising coefficients (K Y) of which each coefficient represents a (LP filter excitation signal) frequency component , the target signal (r) comprising coefficients of which each coefficient represents a frequency component , and the corresponding smoothed signals (y E ;
r E) being selectively modified versions of the primary coded signal (y) respective the target signal (r) wherein a variation is reduced in the coefficient values (K YE) representing frequency information above a threshold value (f T) .

EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
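Analyst note (illustrative only): claim 15 adds a quantizer of the position of the maximum-amplitude sample within the pitch period. A minimal sketch of a uniform position quantizer under an assumed fixed bit budget follows; the bit allocation and step rule are assumptions, not taken from the patent.

def quantize_pulse_position(position, pitch_period, bits=6):
    # Hypothetical sketch: uniform quantization of the pulse position, with a
    # step that grows for long pitch periods so the index fits in 'bits' bits.
    levels = 1 << bits
    step = max(1.0, float(pitch_period) / levels)
    index = min(int(round(position / step)), levels - 1)
    decoded_position = int(round(index * step))
    return index, decoded_position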

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
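Analyst note (illustrative only): claim 16 recites a classifier of frames (unvoiced, unvoiced transition, voiced transition, voiced, onset) and a computer of the energy information parameter taken from a maximum of the signal energy for voiced/onset frames and from an average energy per sample otherwise. A minimal sketch, assuming the class label is already available and using a pitch-length sliding window for the maximum-energy case (the window choice and dB scale are assumptions):

import numpy as np

def energy_information_parameter(frame, frame_class, pitch_period=None):
    # Hypothetical sketch: class-dependent energy measure, expressed in dB.
    x = np.asarray(frame, dtype=float)
    t = int(pitch_period) if pitch_period else 0
    if frame_class in ("voiced", "onset") and 0 < t <= len(x):
        # maximum energy over pitch-length windows within the frame
        energies = [float(np.mean(x[i:i + t] ** 2)) for i in range(0, len(x) - t + 1)]
        e = max(energies)
    else:
        e = float(np.mean(x ** 2))            # average energy per sample
    return 10.0 * np.log10(e + 1e-12)         # guard against log(0)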

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (overlapping region) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
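Analyst note (illustrative only): claim 17 describes scaling the synthesized signal in the first good frame so that its starting energy matches the end of the last concealed frame, then converging toward the transmitted energy parameter by the end of the frame while limiting any increase. A minimal sketch under assumed quarter-frame energy windows and a linear gain ramp (both assumptions, not drawn from the patent):

import numpy as np

def scale_first_good_frame(synth, e_last_concealed_end, e_target, g_max=2.0):
    # Hypothetical sketch: gain ramp from a continuity gain g0 to a
    # convergence gain g1, with the upward excursion capped at g_max.
    x = np.asarray(synth, dtype=float)
    n = len(x)
    q = max(1, n // 4)
    e_begin = float(np.mean(x[:q] ** 2)) + 1e-12
    e_end = float(np.mean(x[-q:] ** 2)) + 1e-12
    g0 = np.sqrt(e_last_concealed_end / e_begin)   # match energy at the frame start
    g1 = min(np.sqrt(e_target / e_end), g_max)     # converge, limiting the increase
    gains = np.linspace(g0, g1, n)                 # sample-by-sample interpolation
    return gains * x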

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (second buffer memory) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (overlapping region) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
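Analyst note (illustrative only): claim 18 limits the scaling gain to a given value when the first good frame after an erasure is classified as onset. A one-line sketch; the cap value is an arbitrary placeholder, not a value from the patent.

def limit_gain_for_onset(g, frame_class, onset_gain_cap=1.0):
    # Hypothetical sketch: cap the scaling gain only for onset frames.
    return min(g, onset_gain_cap) if frame_class == "onset" else g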

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (second buffer memory) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (overlapping region) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .
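Analyst note (illustrative only): claim 19 makes the gain used at the beginning of the first good frame equal to the gain used at its end in two transition cases: voiced-to-unvoiced, and comfort noise to active speech. A minimal sketch of that selection logic; the class labels and flags are assumptions.

def select_frame_gains(g_begin, g_end, last_class, first_class,
                       last_was_comfort_noise, first_is_active_speech):
    # Hypothetical sketch: suppress the gain ramp in the two recited cases.
    voiced_to_unvoiced = (last_class in ("voiced transition", "voiced", "onset")
                          and first_class == "unvoiced")
    cng_to_active = last_was_comfort_noise and first_is_active_speech
    if voiced_to_unvoiced or cng_to_active:
        g_begin = g_end                       # no energy ramp across the frame
    return g_begin, g_end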

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (overlapping region) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP1199812A1
CLAIM 1
A method of encoding an acoustic source signal (x) to produce encoded information (P 1 , P C) for transmission over a transmission medium (306) , comprising : producing , in response to the acoustic source signal (x) , a basic coded signal (P 1) representing perceptually significant characteristics of the acoustic signal (x) , a target signal (r) representing a filtered version of the acoustic source signal (x) , and a primary coded signal (y) representing a reconstructed signal based on the basic coded signal (P 1) , producing , in response to at least one of the primary coded signal (y) and the target signal (r) a corresponding smoothed signal (y E ;
r E) constituting a perceptually improved representation of the primary coded signal (y) respective the target signal (r) , and producing a secondary coded signal (P c) on basis of a combination of either : the smoothed primary coded signal (y E) and the target signal (r) , the primary coded signal (y) and the smoothed target signal (r E) , or the smoothed primary coded signal (y E) and the smoothed target signal (r E) , characterised by the primary coded signal (y) comprising coefficients (K Y) of which each coefficient represents a (LP filter excitation signal) frequency component , the target signal (r) comprising coefficients of which each coefficient represents a frequency component , and the corresponding smoothed signals (y E ;
r E) being selectively modified versions of the primary coded signal (y) respective the target signal (r) wherein a variation is reduced in the coefficient values (K YE) representing frequency information above a threshold value (f T) .

EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non (overlapping region) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1199812A1
CLAIM 1
A method of encoding an acoustic source signal (x) to produce encoded information (P 1 , P C) for transmission over a transmission medium (306) , comprising : producing , in response to the acoustic source signal (x) , a basic coded signal (P 1) representing perceptually significant characteristics of the acoustic signal (x) , a target signal (r) representing a filtered version of the acoustic source signal (x) , and a primary coded signal (y) representing a reconstructed signal based on the basic coded signal (P 1) , producing , in response to at least one of the primary coded signal (y) and the target signal (r) a corresponding smoothed signal (y E ;
r E) constituting a perceptually improved representation of the primary coded signal (y) respective the target signal (r) , and producing a secondary coded signal (P c) on basis of a combination of either : the smoothed primary coded signal (y E) and the target signal (r) , the primary coded signal (y) and the smoothed target signal (r E) , or the smoothed primary coded signal (y E) and the smoothed target signal (r E) , characterised by the primary coded signal (y) comprising coefficients (K Y) of which each coefficient represents a (LP filter excitation signal) frequency component , the target signal (r) comprising coefficients of which each coefficient represents a frequency component , and the corresponding smoothed signals (y E ;
r E) being selectively modified versions of the primary coded signal (y) respective the target signal (r) wherein a variation is reduced in the coefficient values (K YE) representing frequency information above a threshold value (f T) .

EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .
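Analyst note (illustrative only): claims 20-21 adjust the excitation energy when the LP filter gain of the first good frame exceeds that of the last concealed frame, using E_q = E_1 · (E_LP0 / E_LP1), where E_LP0 and E_LP1 are energies of the LP filter impulse responses. A minimal sketch, assuming LP coefficients in the form A(z) = 1 + a1*z^-1 + ... + aM*z^-M and a truncated impulse response (the truncation length is an assumption):

import numpy as np

def lp_impulse_response_energy(a, n=64):
    # Energy of the truncated impulse response of the all-pole filter 1/A(z),
    # with a = [1, a1, ..., aM].
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(i, len(a) - 1) + 1):
            acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    # Hypothetical reading of the recited relation E_q = E_1 * (E_LP0 / E_LP1).
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1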

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (second buffer memory) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (second buffer memory) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (overlapping region) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1199812A1
CLAIM 1
A method of encoding an acoustic source signal (x) to produce encoded information (P 1 , P C) for transmission over a transmission medium (306) , comprising : producing , in response to the acoustic source signal (x) , a basic coded signal (P 1) representing perceptually significant characteristics of the acoustic signal (x) , a target signal (r) representing a filtered version of the acoustic source signal (x) , and a primary coded signal (y) representing a reconstructed signal based on the basic coded signal (P 1) , producing , in response to at least one of the primary coded signal (y) and the target signal (r) a corresponding smoothed signal (y E ;
r E) constituting a perceptually improved representation of the primary coded signal (y) respective the target signal (r) , and producing a secondary coded signal (P c) on basis of a combination of either : the smoothed primary coded signal (y E) and the target signal (r) , the primary coded signal (y) and the smoothed target signal (r E) , or the smoothed primary coded signal (y E) and the smoothed target signal (r E) , characterised by the primary coded signal (y) comprising coefficients (K Y) of which each coefficient represents a (LP filter excitation signal) frequency component , the target signal (r) comprising coefficients of which each coefficient represents a frequency component , and the corresponding smoothed signals (y E ;
r E) being selectively modified versions of the primary coded signal (y) respective the target signal (r) wherein a variation is reduced in the coefficient values (K YE) representing frequency information above a threshold value (f T) .

EP1199812A1
CLAIM 5
A method according to claim 4 , characterised by resulting coefficient values in the overlapping region (first non) s of the frequency bands (i , ii ;
iii) being derived by multiplying each frequency band (i , ii ;
iii) with a window function (W 1 ;
W 2) to obtain corresponding windowed frequency bands , and adding coefficient values of neighbouring windowed frequency bands in each region of overlap .

EP1199812A1
CLAIM 25
A method according to any of the claims 15 - 24 , characterised by the at least one transmitted enhanced coded signal (P and (E)) comprising a first estimate (P and 1) of a first coded signal (P 1) constituting a coded representation (energy information parameter, phase information parameter) of the acoustic signal (x) , and a second estimate (P and c) of a secondary coded signal (P C) indicating how well the first coded signal (P 1) describes the acoustic signal (x) .

EP1199812A1
CLAIM 32
A transmitter according to claim 31 , characterised in that the at least one spectral smoothing unit (305a , 305b) comprises : a first buffer memory (401) to store coefficients (K Y) of the input signal (P) , each coefficient (K P) representing a frequency component , a processing unit (402) to calculate , for coefficients (k Y n+1 - k Y m) corresponding to frequency components above the threshold value (f T) , an average coefficient value (K i , K ii ;
K iii) of the coefficients (k P n+1 - k P m) stored in the first buffer memory (401) for each of at least one frequency band (i , ii ;
iii) , a second buffer memory (sound signal) (403) to repeatedly store the respective average coefficient value (K i , K ii ;
K iii) for the each frequency band (i , ii ;
iii) as many times as there are corresponding coefficients (K P) of the at least one basic coded signal (P) in the particular frequency band (i , ii ;
iii) , and a read-out unit (404) to read out coefficients (k Y 1 - k Y n) up to the threshold value (f T) from the first buffer memory (401) and to read out coefficients (k Y n+1 - k Y m) above the threshold value (f T) from the second buffer memory (403) to form the coefficients (K YE) of the output signal (y E) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP1087379A2

Filed: 2000-09-26     Issued: 2001-03-28

Quantization errors correction method in a audio decoder

(Original Assignee) Pioneer Corp     (Current Assignee) Pioneer Corp

Soichi Toyama (Pioneer Corporation)
US7693710B2
CLAIM 1
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (coding device) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (code values) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

EP1087379A2
CLAIM 7
An audio information decoding device (pitch period) comprising : the quantization error correction device according to any one of claims 1 to 6 ;
and a decoding unit (3) for applying decoding processing corresponding to the compressive-encoding of the audio information onto the outputted decoded value and for outputting decoding result .
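Analyst note (illustrative only): claims 1 and 13 construct, for a lost onset frame, the periodic excitation part as a low-pass filtered periodic train of pulses, with the first impulse response of the low-pass filter centred on the quantized position of the first glottal pulse and the remaining ones spaced by the average pitch value up to the end of the last affected subframe. A minimal sketch under an assumed FIR low-pass impulse response (the placeholder response in the usage line is not the filter of either document):

import numpy as np

def build_periodic_excitation(length, first_pulse_pos, avg_pitch, lp_impulse):
    # Hypothetical sketch: overlap-add copies of the low-pass filter impulse
    # response, the first centred on the quantized glottal-pulse position and
    # the rest spaced by the average pitch value.
    if avg_pitch <= 0:
        raise ValueError("avg_pitch must be positive")
    exc = np.zeros(length)
    half = len(lp_impulse) // 2
    pos = float(first_pulse_pos)
    while pos < length:
        start = int(round(pos)) - half
        for i, h in enumerate(lp_impulse):
            idx = start + i
            if 0 <= idx < length:
                exc[idx] += h
        pos += float(avg_pitch)
    return exc

# e.g. exc = build_periodic_excitation(256, 37, 80.5, np.hanning(17))  # placeholder low-pass response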

US7693710B2
CLAIM 2
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

EP1087379A2
CLAIM 7
An audio information decoding device (pitch period) comprising : the quantization error correction device according to any one of claims 1 to 6 ;
and a decoding unit (3) for applying decoding processing corresponding to the compressive-encoding of the audio information onto the outputted decoded value and for outputting decoding result .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (quantization errors) and the first non erased frame received after frame erasure is encoded as active speech .
EP1087379A2
CLAIM 1
A quantization error correcting device (2) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a detecting unit (10) for detecting , based on bit allocation information indicating bit allocation and encoded values of the compressive-encoded audio information , a range of quantization error indicating a range in which audio information value before compressive-encoding corresponding to the encoded value exists ;
and an outputting unit (11) for outputting a decoded value corresponding to one of the encoded values based on the detected range of quantization error and the ranges of quantization errors (comfort noise) of other correlated ones of the encoded values .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (code values) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
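To make the claim 11 steps concrete, here is a minimal sketch, assuming a uniform position quantizer: the first glottal pulse is taken as the maximum-amplitude sample within the first pitch period and its position is quantized before transmission. The 6-bit uniform quantizer is an assumption, not the patent's scheme.

```python
import numpy as np

def find_and_quantize_first_pulse(excitation, pitch_period, num_bits=6):
    """Locate the maximum-amplitude sample in the first pitch period and
    quantize its position with a uniform (hypothetical) quantizer."""
    pos = int(np.argmax(np.abs(excitation[:pitch_period])))    # first glottal pulse position
    step = max(1, int(np.ceil(pitch_period / 2 ** num_bits)))  # uniform step over the pitch period
    index = pos // step                                        # index sent to the decoder
    quantized_pos = index * step                               # decoder-side reconstruction
    return pos, index, quantized_pos
```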
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

EP1087379A2
CLAIM 7
An audio information decoding device (pitch period) comprising : the quantization error correction device according to any one of claims 1 to 6 ;
and a decoding unit (3) for applying decoding processing corresponding to the compressive-encoding of the audio information onto the outputted decoded value and for outputting decoding result .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (coding device) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (code values) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
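The artificial periodic excitation recited above (and in claim 1) amounts to a train of low-pass filter impulse responses, the first centred on the quantized position of the first glottal pulse and the remainder spaced by the average pitch value. A minimal sketch, assuming a short hypothetical low-pass impulse response:

```python
import numpy as np

def build_periodic_excitation(quantized_pos, avg_pitch, frame_len, lp_impulse_response):
    """Place low-pass filter impulse responses at quantized_pos, then every
    avg_pitch samples, up to the end of the constructed region."""
    excitation = np.zeros(frame_len)
    half = len(lp_impulse_response) // 2
    pos = quantized_pos
    while pos < frame_len:                          # up to end of last affected subframe
        start = pos - half                          # centre the response on pos
        for k, h in enumerate(lp_impulse_response):
            idx = start + k
            if 0 <= idx < frame_len:
                excitation[idx] += h
        pos += avg_pitch                            # next pulse one average pitch later
    return excitation

# example use with a hypothetical 5-tap low-pass impulse response
h_lp = np.array([0.1, 0.25, 0.3, 0.25, 0.1])
exc = build_periodic_excitation(quantized_pos=17, avg_pitch=60, frame_len=256,
                                lp_impulse_response=h_lp)
```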
EP1087379A2
CLAIM 4
A quantization error correcting device (2a) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a memory (21) for storing correction values for correcting the encoded values for each frequency band , the correction values being calculated based on , at least , an error between the encoded value of the compressive-encoded audio information and audio information value before compressive-encoding , and a level of the encoded value in other correlated ones of the encode values (average pitch value, concealing frame erasure) ;
and an outputting unit (20 , 22) for reading out the correction value from the memory based on the bit allocation information indicating the bit allocation and the encoded value , and for outputting decoded value corresponding to the encoded value for each frequency band based on the correction value read out from the memory and the encoded value .

EP1087379A2
CLAIM 7
An audio information decoding device (pitch period) comprising : the quantization error correction device according to any one of claims 1 to 6 ;
and a decoding unit (3) for applying decoding processing corresponding to the compressive-encoding of the audio information onto the outputted decoded value and for outputting decoding result .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1087379A2
CLAIM 7
An audio information decoding device (pitch period) comprising : the quantization error correction device according to any one of claims 1 to 6 ;
and a decoding unit (3) for applying decoding processing corresponding to the compressive-encoding of the audio information onto the outputted decoded value and for outputting decoding result .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (quantization errors) and the first non erased frame received after frame erasure is encoded as active speech .
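The two transition cases above reduce to a simple gain-selection rule at the decoder. A minimal sketch, assuming string frame-class labels and externally supplied gains (the names and defaults are illustrative only):

```python
def select_recovery_gains(last_class, first_class, last_was_comfort_noise,
                          first_is_active, gain_end, default_begin_gain):
    """Use the end-of-frame gain from the start of the first good frame in a
    voiced-to-unvoiced transition or when recovering from comfort noise into
    active speech; otherwise keep the default starting gain."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
    cn_to_active = last_was_comfort_noise and first_is_active
    if voiced_to_unvoiced or cn_to_active:
        return gain_end, gain_end          # g_begin made equal to g_end
    return default_begin_gain, gain_end
```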
EP1087379A2
CLAIM 1
A quantization error correcting device (2) for correcting quantization error included in audio information at the time of decoding , the audio information being divided into a plurality of frequency bands and compressive-encoded for each frequency band with bit allocation determined based on audible frequency characteristic , the device comprising : a detecting unit (10) for detecting , based on bit allocation information indicating bit allocation and encoded values of the compressive-encoded audio information , a range of quantization error indicating a range in which audio information value before compressive-encoding corresponding to the encoded value exists ;
and an outputting unit (11) for outputting a decoded value corresponding to one of the encoded values based on the detected range of quantization error and the ranges of quantization errors (comfort noise) of other correlated ones of the encoded values .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1087379A2
CLAIM 7
An audio information decoding device (pitch period) comprising : the quantization error correction device according to any one of claims 1 to 6 ;
and a decoding unit (3) for applying decoding processing corresponding to the compressive-encoding of the audio information onto the outputted decoded value and for outputting decoding result .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP1132892A1

Filed: 2000-08-23     Issued: 2001-09-12

Voice encoder and voice encoding method

(Original Assignee) Panasonic Corp     (Current Assignee) Panasonic Corp

Kazutoshi Yasunaga, Toshiyuki Morii
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .
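The prediction-coefficient adjustment recited in EP1132892A1 claim 1 can be pictured as damping the gain predictor when the previous subframe is judged unreliable. A minimal sketch, assuming moving-average prediction over past subframe gains and an illustrative 0.5 damping factor:

```python
def predict_gain(past_gains, pred_coeffs, prev_subframe_unreliable):
    """Predict the current subframe gain from past gains, with the prediction
    coefficients scaled down when the previous subframe state is poor."""
    coeffs = [c * 0.5 for c in pred_coeffs] if prev_subframe_unreliable else pred_coeffs
    return sum(c * g for c, g in zip(coeffs, past_gains))

# example: three-tap predictor over the last three subframe gains
print(predict_gain([0.9, 1.1, 1.0], [0.5, 0.3, 0.2], prev_subframe_unreliable=True))
```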

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
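A minimal sketch of the claim 4 energy information parameter, assuming the value is reported in dB (the dB scaling is an assumption): the maximum sample energy is used for voiced or onset frames and the average energy per sample for all other classes.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter: max sample energy for voiced/onset
    frames, average energy per sample otherwise."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        e = np.max(x ** 2)        # maximum of the signal energy
    else:
        e = np.mean(x ** 2)       # average energy per sample
    return 10.0 * np.log10(e + 1e-12)
```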
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (medium storing) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
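The energy control recited above can be pictured as a per-sample gain ramp across the first good frame: the starting gain matches the concealed signal's end energy and the ending gain converges to the transmitted energy information, with the gain capped so the energy increase stays limited. A minimal sketch; the quarter-frame energy windows and the 1.98 cap are assumptions:

```python
import numpy as np

def scale_recovered_frame(synth, e_end_concealed, e_target, max_gain=1.98):
    """Scale the synthesized frame with a gain interpolated from g0 (match the
    concealed signal's end energy) to g1 (converge to the received energy),
    both capped to limit any energy increase."""
    quarter = max(1, len(synth) // 4)
    e_begin = np.mean(synth[:quarter] ** 2) + 1e-12      # energy at frame start
    e_end = np.mean(synth[-quarter:] ** 2) + 1e-12       # energy at frame end
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)
    g1 = min(np.sqrt(e_target / e_end), max_gain)
    ramp = np.linspace(g0, g1, len(synth))               # sample-by-sample interpolation
    return synth * ramp
```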
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

EP1132892A1
CLAIM 13
A computer-readable recording medium storing (controlling energy) a speech coding program , an adaptive codebook storing past synthesized excitation vector signals and a stochastic codebook storing a plurality of excitation vectors , said speech coding program comprising the steps of : obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in said adaptive codebook and said stochastic codebook using an LPC coefficient obtained from an input speech ;
calculating gains of said adaptive excitation vectors and said stochastic excitation vectors ;
performing vector quantization on the adaptive excitation vectors and stochastic excitation vectors determined using coding distortion between said input speech and said synthesized speech , and said gains , wherein said vector quantization step further comprising the steps of : determining a quantization target vector based on coding distortion between a plurality of quantization target vectors and prediction coefficients used for predictive coding ;
and adjusting said prediction coefficients according to the state of a previous subframe .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
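Read literally, the relation scales the end-of-frame energy E_1 by the ratio of the LP filter impulse-response energies before and after the erasure. A minimal sketch, assuming the LP synthesis filters are given as direct-form denominator coefficients and truncating the impulse responses to 64 samples (both assumptions):

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the (truncated) impulse response of the LP synthesis filter 1/A(z)."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1), with E_LP0 and E_LP1 the impulse-response
    energies of the LP filters of the last good frame before and the first
    good frame after the erasure."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1
```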
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

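To illustrate the claim 23 limitations charted above, which measure the sample of maximum amplitude within a pitch period as the first glottal pulse and quantize its position, a minimal sketch is given below; the quantization step of 4 samples and the function names are assumptions for illustration only.

import numpy as np

def find_first_glottal_pulse(residual, pitch_period):
    """Return the index of the maximum-amplitude sample within the first
    pitch period of the LP residual (boundary handling assumed)."""
    search_len = min(int(round(pitch_period)), len(residual))
    return int(np.argmax(np.abs(residual[:search_len])))

def quantize_pulse_position(position, step=4):
    """Uniformly quantize the pulse position (step of 4 samples assumed)."""
    index = position // step
    reconstructed = index * step + step // 2
    return index, reconstructed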
US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

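To illustrate the claim 24 energy information parameter charted above, computed from a maximum of the signal energy for frames classified as voiced or onset and from an average energy per sample for other frames, a hedged sketch follows; the dB scaling and the treatment of the frame as a NumPy array are assumptions not taken from the claim.

import numpy as np

def energy_information(frame, frame_class):
    """Energy parameter per the claim 24 language: maximum of the signal
    energy for 'voiced'/'onset' frames, average energy per sample otherwise.
    The 10*log10 scaling is an illustrative assumption."""
    samples = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energy = np.max(samples ** 2)
    else:
        energy = np.mean(samples ** 2)
    return 10.0 * np.log10(energy + 1e-12)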
US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
EP1132892A1
CLAIM 1
A speech coder comprising : LPC synthesizing means for obtaining a synthesized speech by filtering adaptive excitation vectors and stochastic excitation vectors stored in an adaptive codebook (sound signal, speech signal) and stochastic codebook using an LPC coefficient obtained from an input speech ;
gain calculating means for calculating gains of said adaptive excitation vectors and said stochastic excitation vectors and searching codes of the adaptive excitation vectors and stochastic excitation vectors using coding distortion between said input speech and said synthesized speech obtained using said gains ;
and parameter coding means for performing predictive coding of gains using the adaptive excitation vectors and stochastic excitation vectors corresponding to the codes obtained , wherein said parameter coding means comprises prediction coefficient adjusting means for adjusting a prediction coefficient used for said predictive coding according to the state of a previous subframe .

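For readability, the energy-adjustment relation recited in claims 9, 12, 21 and 25 as charted here can be restated in conventional notation. The sum-of-squares definition of the impulse-response energies below is an editorial assumption consistent with the claim wording, not a quotation from the patent:

E_q = E_1 \, \frac{E_{LP0}}{E_{LP1}}, \qquad
E_{LP0} = \sum_{n} h_0^2(n), \qquad
E_{LP1} = \sum_{n} h_1^2(n)

where h_0(n) and h_1(n) denote the impulse responses of the LP filters of the last non-erased frame received before the erasure and of the first non-erased frame received after the erasure, respectively.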



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP1074976A2

Filed: 2000-08-04     Issued: 2001-02-07

Block switching based subband audio coder

(Original Assignee) Ricoh Co Ltd     (Current Assignee) Ricoh Co Ltd

Tadashi Araki
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

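To illustrate the claim 5 (and later claim 17) energy-control limitations charted above, which scale the synthesized signal so that its energy at the beginning of the first non-erased frame matches the energy at the end of the last erased frame and then converge toward the received energy information while limiting any increase, a minimal sketch follows; the sample-by-sample linear gain interpolation and the gain cap of 2.0 are illustrative choices, not claim limitations.

import numpy as np

def scale_recovered_frame(synth, e_begin_target, e_end_target, max_gain=2.0):
    """Scale the first non-erased frame: the start gain matches the concealed
    energy, the end gain converges to the received energy parameter, and both
    gains are capped to limit any energy increase (cap value assumed)."""
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    q = max(n // 4, 1)
    e_begin = np.mean(synth[:q] ** 2) + 1e-12
    e_end = np.mean(synth[-q:] ** 2) + 1e-12
    g0 = min(np.sqrt(e_begin_target / e_begin), max_gain)
    g1 = min(np.sqrt(e_end_target / e_end), max_gain)
    gains = np.linspace(g0, g1, n)   # linear gain interpolation across the frame (assumed)
    return synth * gains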
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

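The claim 6 (and later claim 18) limitation charted above caps the scaling gain at a given value when the first non-erased frame is classified as onset. A one-line sketch; the cap value of 0.8 is a hypothetical illustration, not a value taken from the patent.

def limit_onset_gain(gain, frame_class, onset_cap=0.8):
    """Cap the synthesis scaling gain for onset frames (cap value assumed)."""
    return min(gain, onset_cap) if frame_class == "onset" else gain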
US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

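The claim 7 (and later claim 19) limitation charted above makes the scaling gain at the beginning of the first non-erased frame equal to the gain at its end in two situations: a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition. A sketch of that decision logic; the class and coding-mode labels are assumed string names used only for illustration.

def use_constant_gain(last_good_class, first_good_class,
                      last_good_mode, first_good_mode):
    """True when the start gain should simply equal the end gain, per the
    two transition cases recited in the claim (labels assumed)."""
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    cn_to_active = (last_good_mode == "comfort_noise"
                    and first_good_mode == "active_speech")
    return voiced_to_unvoiced or cn_to_active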
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (comparison means) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

EP1074976A2
CLAIM 5
Digital acoustic signal coding apparatus comprising means in which a digital acoustic signal is inputted along a time axis and divided into blocks therealong , processings including a sub-band division and conversion to frequency area are practiced per each of the respective block , said acoustic signal is divided into plural band widths , coded bits are allocated to each of said respective band widths , a normalized coefficient is obtained corresponding to the coded bit number of the allocated bits , and said digital acoustic signal is compressed and coded by quantizing said acoustic signal with said normalized coefficient , means for converting said acoustic signal divided into the blocks to either one of a long conversion block or plural short conversion blocks , when the conversion to said frequency area is performed ;
means for dividing said plural short conversion blocks into the groups of plural blocks respectively including one or plural short conversion blocks , when said short conversion blocks are employed ;
and means for quantizing said acoustic signal , causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said digital acoustic signal coding apparatus further comprising : sensation entropy calculation means (12) for calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
sensation entropy total sum calculation means (13) for obtaining said total sum in the frame of said sensation entropy calculated by said sensation entropy calculation medium (12) ;
comparison means (current frame) (14) for comparing the absolute value of the difference between the respective total sums in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and long/short blocks judgment means (15) for judging whether said long block or said short blocks should convert the block of said input acoustic signal on the basis of the comparison result obtained by said comparison medium .

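Claims 8 and 9 charted above (and the corresponding claims 12, 20, 21 and 25 charted below) adjust the excitation energy when the LP-filter gain of the first good frame exceeds that of the last concealed frame, using E_q = E_1 (E_LP0 / E_LP1). A hedged sketch of that computation follows; the 64-sample truncation of the impulse response and the use of scipy.signal.lfilter to realize the LP synthesis filter 1/A(z) are assumptions for illustration.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z),
    with a_coeffs = [1, a1, ..., ap]; the 64-sample truncation is assumed."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1), following the claim language."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1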
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (comparison means) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

EP1074976A2
CLAIM 5
Digital acoustic signal coding apparatus comprising means in which a digital acoustic signal is inputted along a time axis and divided into blocks therealong , processings including a sub-band division and conversion to frequency area are practiced per each of the respective block , said acoustic signal is divided into plural band widths , coded bits are allocated to each of said respective band widths , a normalized coefficient is obtained corresponding to the coded bit number of the allocated bits , and said digital acoustic signal is compressed and coded by quantizing said acoustic signal with said normalized coefficient , means for converting said acoustic signal divided into the blocks to either one of a long conversion block or plural short conversion blocks , when the conversion to said frequency area is performed ;
means for dividing said plural short conversion blocks into the groups of plural blocks respectively including one or plural short conversion blocks , when said short conversion blocks are employed ;
and means for quantizing said acoustic signal , causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said digital acoustic signal coding apparatus further comprising : sensation entropy calculation means (12) for calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
sensation entropy total sum calculation means (13) for obtaining said total sum in the frame of said sensation entropy calculated by said sensation entropy calculation medium (12) ;
comparison means (current frame) (14) for comparing the absolute value of the difference between the respective total sums in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and long/short blocks judgment means (15) for judging whether said long block or said short blocks should convert the block of said input acoustic signal on the basis of the comparison result obtained by said comparison medium .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (comparison means) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

EP1074976A2
CLAIM 5
Digital acoustic signal coding apparatus comprising means in which a digital acoustic signal is inputted along a time axis and divided into blocks therealong , processings including a sub-band division and conversion to frequency area are practiced per each of the respective block , said acoustic signal is divided into plural band widths , coded bits are allocated to each of said respective band widths , a normalized coefficient is obtained corresponding to the coded bit number of the allocated bits , and said digital acoustic signal is compressed and coded by quantizing said acoustic signal with said normalized coefficient , means for converting said acoustic signal divided into the blocks to either one of a long conversion block or plural short conversion blocks , when the conversion to said frequency area is performed ;
means for dividing said plural short conversion blocks into the groups of plural blocks respectively including one or plural short conversion blocks , when said short conversion blocks are employed ;
and means for quantizing said acoustic signal , causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said digital acoustic signal coding apparatus further comprising : sensation entropy calculation means (12) for calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
sensation entropy total sum calculation means (13) for obtaining said total sum in the frame of said sensation entropy calculated by said sensation entropy calculation medium (12) ;
comparison means (current frame) (14) for comparing the absolute value of the difference between the respective total sums in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and long/short blocks judgment means (15) for judging whether said long block or said short blocks should convert the block of said input acoustic signal on the basis of the comparison result obtained by said comparison medium .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (comparison means) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
EP1074976A2
CLAIM 3
A method of coding a digital acoustic signal comprising the steps of : inputting a digital acoustic signal along a time axis ;
dividing said digital acoustic signal into blocks therealong by use of a computer ;
practicing processings including a sub-band division or conversion to frequency area per each of the respective blocks ;
dividing said acoustic signal into plural band widths ;
allocating coded bits to each of said respective band widths ;
obtaining a normalized coefficient , corresponding to the coded bit number of the allocated bits ;
and compressing and coding said digital acoustic signal by quantizing said acoustic signal with said normalized coefficient , wherein , when the conversion to said frequency area is performed , said acoustic signal divided into the blocks is converted to either one of a long conversion block or plural short conversion blocks ;
wherein , when said short conversion blocks are employed , said plural short conversion blocks are divided into the groups of plural blocks respectively including one or plural short conversion blocks ;
and wherein said acoustic signal is practiced to quantize causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said method further comprising the steps of : calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
obtaining said sum total in the frame of said calculated sensation entropy ;
comparing the absolute value of the difference between the respective sum totals in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and judging the later frame among said two frames successive in the elapsing time (first non) to be converted by said short blocks , when said absolute value is larger than said threshold value , and judging the later frame among said two frames successive in the elapsing time to be converted by said long block , when said absolute value is smaller than said threshold value .

EP1074976A2
CLAIM 5
Digital acoustic signal coding apparatus comprising means in which a digital acoustic signal is inputted along a time axis and divided into blocks therealong , processings including a sub-band division and conversion to frequency area are practiced per each of the respective block , said acoustic signal is divided into plural band widths , coded bits are allocated to each of said respective band widths , a normalized coefficient is obtained corresponding to the coded bit number of the allocated bits , and said digital acoustic signal is compressed and coded by quantizing said acoustic signal with said normalized coefficient , means for converting said acoustic signal divided into the blocks to either one of a long conversion block or plural short conversion blocks , when the conversion to said frequency area is performed ;
means for dividing said plural short conversion blocks into the groups of plural blocks respectively including one or plural short conversion blocks , when said short conversion blocks are employed ;
and means for quantizing said acoustic signal , causing one or plural short conversion block included in the same group to correspond to a common normalized coefficient ;
said digital acoustic signal coding apparatus further comprising : sensation entropy calculation means (12) for calculating the sensation entropy of an input acoustic signal calculated per each of said respective short conversion blocks ;
sensation entropy total sum calculation means (13) for obtaining said total sum in the frame of said sensation entropy calculated by said sensation entropy calculation medium (12) ;
comparison means (current frame) (14) for comparing the absolute value of the difference between the respective total sums in the frame of the sensation entropy of the two frames being successive in relation to the elapsing time with a previously determined threshold value ;
and long/short blocks judgment means (15) for judging whether said long block or said short blocks should convert the block of said input acoustic signal on the basis of the comparison result obtained by said comparison medium .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP1047047A2

Filed: 2000-03-23     Issued: 2000-10-25

Audio signal coding and decoding methods and apparatus and recording media with programs therefor

(Original Assignee) Nippon Telegraph and Telephone Corp     (Current Assignee) Nippon Telegraph and Telephone Corp

Kazuaki Chikira, Naoki Iwakami, Akio Jin, Takeshi Mori, Takehiro Moriya (Nippon Telegraph and Telephone Corp.)
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (fixed number) affected by the artificial construction of the periodic part .
EP1047047A2
CLAIM 1
An audio signal coding method for coding input audio signal samples , said method comprising the steps of : (a) time-frequency transforming every fixed number (last subframe) of input audio signal samples into frequency-domain coefficients ;
(b) dividing said frequency-domain coefficients into coefficient segments each consisting of one or more coefficients to generate a sequence of coefficient segments ;
(c) calculating the intensity of each coefficient segment of said sequence of coefficient segments ;
(d) classifying the coefficient segments in the sequence into either one of at least two groups according to the intensities of said coefficient segments to generate at least two sequences of coefficient segments , and encoding and outputting classification information as a classification information code ;
and (e) encoding said at least two sequences of coefficient segments and outputting them as coefficient codes .
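
For readability of the EP1047047A2 claim 1 language quoted above, here is a minimal Python sketch of the recited flow under stated assumptions: a fixed number of samples is transformed to frequency-domain coefficients, the coefficients are split into segments, each segment's intensity is measured, and the segments are classified into two groups by intensity together with per-segment classification information. The DFT, the segment width and the median threshold are assumptions; they are not the reference's actual transform, quantization or coding.

import numpy as np

def classify_coefficient_segments(samples, seg_width=4):
    # (a) time-frequency transform of a fixed number of input samples
    coeffs = np.fft.rfft(samples)
    # (b) divide the coefficients into segments of one or more coefficients
    n_seg = len(coeffs) // seg_width
    segments = [coeffs[i * seg_width:(i + 1) * seg_width] for i in range(n_seg)]
    # (c) intensity of each coefficient segment
    intensities = [float(np.sum(np.abs(s) ** 2)) for s in segments]
    # (d) classify the segments into two groups by intensity; the group index
    #     per segment is the classification information to be encoded
    threshold = float(np.median(intensities))
    classification = [1 if e > threshold else 0 for e in intensities]
    strong = [s for s, c in zip(segments, classification) if c == 1]
    weak = [s for s, c in zip(segments, classification) if c == 0]
    # (e) each group would then be encoded separately as coefficient codes
    return classification, strong, weak

bits, strong, weak = classify_coefficient_segments(np.random.randn(64))
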

EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
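
As a reading aid for the element above, a minimal Python sketch, assuming the first glottal pulse is taken as the sample of maximum amplitude of an excitation-domain (LP residual) signal within the first pitch period of the frame, and that its position is then uniformly quantized. The residual signal, the pitch value and the quantization step are illustrative assumptions, not the patent's actual quantizer.

import numpy as np

def quantize_first_glottal_pulse(residual, pitch, step=4):
    """Locate the maximum-amplitude sample in the first pitch period and
    quantize its position; also return its sign and amplitude."""
    window = np.abs(residual[:pitch])          # first pitch period of the frame
    position = int(np.argmax(window))          # sample of maximum amplitude
    sign = 1 if residual[position] >= 0 else -1
    amplitude = float(abs(residual[position]))
    q_position = int(round(position / step)) * step
    return q_position, sign, amplitude

q_pos, sign, amp = quantize_first_glottal_pulse(np.random.randn(256), pitch=80)
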
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
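
To make the energy-information element above easier to parse, a minimal Python sketch under assumptions: for frames classified as voiced or onset the parameter tracks a maximum of the signal energy (here simply the largest squared sample), and for other frames it tracks the average energy per sample. The windowing and any log-domain quantization used by the patent are not reproduced.

import numpy as np

def energy_information(frame, frame_class):
    """Illustrative energy-information parameter (not the patent's exact formula)."""
    s = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(s * s))   # relate to a maximum of the signal energy
    return float(np.mean(s * s))      # relate to an average energy per sample

e_onset = energy_information(np.random.randn(256), "onset")
e_other = energy_information(np.random.randn(256), "unvoiced")
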
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
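
As an aid to parsing the energy-control element of claim 5, a minimal Python sketch, assuming a per-sample scaling gain that starts at a value matching the energy at the end of the concealed signal and is linearly interpolated toward a value matching the received energy information, with both gains capped so that any increase in energy is limited. The 32-sample measurement windows, the linear interpolation and the cap are assumptions.

import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    """Scale the synthesized first non-erased frame received after an erasure."""
    s = np.asarray(synth, dtype=float)
    e_begin = np.mean(s[:32] ** 2) + 1e-12   # energy at the frame beginning
    e_end = np.mean(s[-32:] ** 2) + 1e-12    # energy at the frame end
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match concealed energy
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to target energy
    gains = np.linspace(g0, g1, len(s))      # sample-by-sample interpolation
    return s * gains

out = scale_first_good_frame(np.random.randn(256), e_end_concealed=0.5, e_target=0.8)
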
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
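
A minimal Python sketch of the transition rule in claim 7, assuming the decoder already knows the classification of the last good frame before the erasure and of the first good frame after it; the class labels, argument names and the fallback gain are purely illustrative.

def beginning_gain(g_end, g_normal, last_class, first_class,
                   last_was_comfort_noise, first_is_active_speech):
    # Transition from a voiced-like frame to an unvoiced frame
    voiced_to_unvoiced = (last_class in ("voiced transition", "voiced", "onset")
                          and first_class == "unvoiced")
    # Transition from a non-active (comfort noise) period to active speech
    inactive_to_active = last_was_comfort_noise and first_is_active_speech
    if voiced_to_unvoiced or inactive_to_active:
        return g_end      # beginning gain made equal to the end-of-frame gain
    return g_normal       # otherwise the normal energy-matching gain is used

g0 = beginning_gain(0.7, 1.1, "onset", "unvoiced",
                    last_was_comfort_noise=False, first_is_active_speech=True)
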
EP1047047A2
CLAIM 31
A decoding apparatus which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
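
To make the quoted relation concrete, a minimal Python sketch that computes E_LP0 and E_LP1 as the energies of the impulse responses of the two LP synthesis filters (by driving each all-pole filter 1/A(z) with a unit impulse) and then applies E_q = E_1 * E_LP0 / E_LP1. The filter coefficients, the impulse-response length and the E_1 value are illustrative assumptions.

import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the impulse response of 1/A(z), A(z) = 1 + a1 z^-1 + ... + aM z^-M."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0            # unit impulse input
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h * h))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    e_lp0 = lp_impulse_response_energy(a_last_good)    # last frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)   # first frame after erasure
    return e1 * e_lp0 / e_lp1                          # E_q = E_1 * E_LP0 / E_LP1

e_q = adjusted_excitation_energy(1.0, [1.0, -0.9], [1.0, -0.5])
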
EP1047047A2
CLAIM 31
A decoding apparatus which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (fixed number) affected by the artificial construction of the periodic part .
EP1047047A2
CLAIM 1
An audio signal coding method for coding input audio signal samples , said method comprising the steps of : (a) time-frequency transforming every fixed number (last subframe) of input audio signal samples into frequency-domain coefficients ;
(b) dividing said frequency-domain coefficients into coefficient segments each consisting of one or more coefficients to generate a sequence of coefficient segments ;
(c) calculating the intensity of each coefficient segment of said sequence of coefficient segments ;
(d) classifying the coefficient segments in the sequence into either one of at least two groups according to the intensities of said coefficient segments to generate at least two sequences of coefficient segments , and encoding and outputting classification information as a classification information code ;
and (e) encoding said at least two sequences of coefficient segments and outputting them as coefficient codes .

EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP1047047A2
CLAIM 31
A decoding apparatus which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP1047047A2
CLAIM 31
A decoding apparatus which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
EP1047047A2
CLAIM 31
A decoding apparatus (decoder recovery, decoder constructs) which receives input digital codes and outputs audio signal samples , the apparatus comprising : an inverse-quantization part for decoding said input digital codes into plural sequences of coefficient segments ;
a coefficient combining part for decoding said input digital codes to obtain classification information of said coefficient segments , and combining said plural sequences of coefficient segments based on said classification information to reconstruct a single sequence of frequency-domain coefficients sequentially arranged ;
and a frequency-time (first non) transformation part for frequency-time transforming the reconstructed frequency-domain coefficients into the time domain and outputting the resulting audio signal samples as an audio signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6377915B1

Filed: 2000-03-14     Issued: 2002-04-23

Speech decoding using mix ratio table

(Original Assignee) YRP Advanced Mobile Communication Systems Res Labs Co Ltd     (Current Assignee) YRP ADVANCED MOBILE COMMUNICATION SYSTEMS RESEARCH LABORATORIES Co Ltd ; YRP Advanced Mobile Communication Systems Res Labs Co Ltd

Seishi Sasaki
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (predetermined frequency) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency (placing remaining impulse responses) bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .
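
Purely to help parse the US6377915B1 claim 1 language quoted above, a minimal Python sketch of a mixed excitation of this general kind: per frequency band, a pitch-pulse train and white noise are mixed with a band-dependent ratio (which the reference takes from a stored table selected using the identified band and the voiced/unvoiced information), and the band signals are summed. The number of bands, the FFT-mask band split and the ratio values are illustrative assumptions, not the reference's filter bank or table.

import numpy as np

def mixed_excitation(pitch, n_samples, mix_ratios):
    """Sum per-band mixtures of a pitch-pulse train and white noise."""
    n_bands = len(mix_ratios)
    t = np.arange(n_samples)
    pulses = (t % pitch == 0).astype(float)     # pitch pulse train
    noise = np.random.randn(n_samples)
    spec_p = np.fft.rfft(pulses)
    spec_n = np.fft.rfft(noise)
    edges = np.linspace(0, len(spec_p), n_bands + 1).astype(int)
    out = np.zeros(len(spec_p), dtype=complex)
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        r = mix_ratios[b]                       # ratio of pitch pulse in band b
        out[lo:hi] = r * spec_p[lo:hi] + (1.0 - r) * spec_n[lo:hi]
    return np.fft.irfft(out, n_samples)

exc = mixed_excitation(pitch=80, n_samples=256,
                       mix_ratios=[0.9, 0.8, 0.6, 0.4, 0.2])
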

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (lowest frequency) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder (last frame, replacement frame) , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (lowest frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder (last frame, replacement frame) , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (lowest frequency) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .
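
A worked sketch of the energy-adjustment relation recited in claims 8 and 9 (and repeated in claims 12, 20, 21 and 25): the gain of each LP filter is represented by the energy of its impulse response, and the target excitation energy is E_q = E_1 · (E_LP0 / E_LP1). The truncation length of the impulse response and the helper names are assumptions; this is a sketch of the recited relation, not the patentee's implementation.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, n=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z);
    a_coeffs = [1, a1, ..., aM], truncated to n samples (n is illustrative)."""
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def target_excitation_energy(e1, a_last_good, a_first_good, n=64):
    """E_q = E_1 * (E_LP0 / E_LP1): E_LP0 and E_LP1 are the impulse-response
    energies of the LP filters of the last good frame received before the
    erasure and of the first good frame received after it."""
    e_lp0 = lp_impulse_response_energy(a_last_good, n)
    e_lp1 = lp_impulse_response_energy(a_first_good, n)
    return e1 * e_lp0 / e_lp1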

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (lowest frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder (last frame, replacement frame) , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (predetermined frequency) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency (placing remaining impulse responses) bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .
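
As a rough illustration of the artificial onset reconstruction recited in claim 13 (and in claim 1 as charted later against JP2001166800A), the periodic excitation part can be realised as a low-pass filtered pulse train: one impulse response of the low-pass filter is centred on the quantized position of the first glottal pulse, and further copies are placed an average pitch period apart up to the end of the last affected subframe. The low-pass filter taps, rounding and boundary handling below are assumptions.

import numpy as np

def build_periodic_excitation(h_lp, first_pulse_pos, avg_pitch, length):
    """Periodic excitation built as copies of the low-pass filter impulse
    response h_lp centred at pulse positions spaced by the average pitch,
    up to 'length' samples (the end of the last affected subframe)."""
    exc = np.zeros(length)
    half = len(h_lp) // 2
    pos = int(round(first_pulse_pos))          # quantized first glottal pulse
    pitch = max(1, int(round(avg_pitch)))      # guard against a zero pitch
    while pos < length:
        start = pos - half                     # centre the response on 'pos'
        for k, tap in enumerate(h_lp):
            idx = start + k
            if 0 <= idx < length:              # clip at the frame boundaries
                exc[idx] += tap
        pos += pitch
    return exc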

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (lowest frequency) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .
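
The energy information parameter of claims 16 and 24 (and of method claim 4 charted later against JP2001166800A) distinguishes voiced or onset frames from the others; a minimal sketch, assuming a per-sample energy measure (a real codec would typically measure the voiced/onset energy pitch-synchronously), is:

import numpy as np

def energy_information(frame, frame_class):
    """Maximum of the signal energy for voiced or onset frames,
    average energy per sample for all other frame classes."""
    e = np.asarray(frame, dtype=float) ** 2
    if frame_class in ("VOICED", "ONSET"):
        return float(np.max(e))
    return float(np.mean(e))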

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder (last frame, replacement frame) , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .
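
The energy-control behaviour recited in claim 17 (matching the start of the first good frame to the end of the concealed frame, then converging to the transmitted energy while limiting any increase) could look roughly like the following; the segment length, linear gain interpolation and cap value are illustrative assumptions.

import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain_increase=1.2):
    """Scale the synthesized first good frame so that its initial energy matches
    the energy at the end of the concealed frame and its final energy converges
    towards the target energy, with the upward gain step limited."""
    n = len(synth)
    seg = max(1, n // 4)                        # short measurement segments
    e_begin = float(np.mean(synth[:seg] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-seg:] ** 2)) + 1e-12
    g0 = np.sqrt(e_end_concealed / e_begin)     # match energy at the frame start
    g1 = np.sqrt(e_target / e_end)              # converge to the target at the end
    g1 = min(g1, g0 * max_gain_increase)        # limit the increase in energy
    gains = np.linspace(g0, g1, n)              # sample-by-sample interpolation
    return synth * gains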

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .
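
The onset case of claim 18 simply caps the scaling gain; a one-line sketch (the cap value is illustrative):

def clamp_onset_gain(gain, first_good_class, g_max=0.8):
    """Limit the scaling gain to a given value when the first good frame
    received after the erasure is classified as onset."""
    return min(gain, g_max) if first_good_class == "ONSET" else gain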

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .
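
For comparison with the charted reference, the band-wise mixed excitation of US6377915B1 claim 1 (quoted above) can be sketched as follows, assuming the band-limited pitch-pulse and white-noise components have already been produced and the per-band mixing ratios ("voicing strengths") come from the stored table:

import numpy as np

def mixed_excitation(pulse_bands, noise_bands, voicing_strengths):
    """Mix a pitch-pulse component and a white-noise component in each
    frequency band according to its voicing strength, then sum the bands
    into a single mixed excitation signal."""
    exc = np.zeros_like(np.asarray(pulse_bands[0], dtype=float))
    for pulses, noise, v in zip(pulse_bands, noise_bands, voicing_strengths):
        exc += v * np.asarray(pulses, dtype=float) + (1.0 - v) * np.asarray(noise, dtype=float)
    return exc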

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (lowest frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder (last frame, replacement frame) , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (lowest frequency) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (lowest frequency) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal (speech signal, decoder determines concealment) from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (lowest frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6377915B1
CLAIM 1
. A speech decoding method for reproducing a speech signal from a speech information bit stream which is a coded output of the speech signal that has been encoded by a linear prediction analysis and synthesis type speech encoder (last frame, replacement frame) , said speech decoding method comprising the steps of : separating spectral envelope information , voiced/unvoiced discriminating information , pitch period information and gain information from said speech information bit stream , whereby forming a plurality of separated informations , and decoding each separated information ;
obtaining a spectral envelope amplitude from said spectral envelope information , and identifying a frequency band having a largest spectral envelope amplitude among a predetermined number of frequency bands each having a predetermined frequency bandwidth divided on a frequency axis for generating a mixed excitation signal ;
determining a mixing ratio for each of said predetermined number of frequency bands , based on said identified frequency band and said voiced/unvoiced discriminating information and using said mixing ratio to mix a pitch pulse generated in response to said pitch period information and white noise with reference to a predetermined mixing ratio table that has previously been stored ;
producing a mixing signal for each of said predetermined number of frequency bands based on said determined mixing ratio , and then producing said mixed excitation signal by summing all of said mixing signals of said predetermined number of frequency bands ;
and producing a reproduced speech by adding said spectral envelope information and said gain information to said mixed excitation signal .

US6377915B1
CLAIM 3
. The speech decoding method in accordance with claim 2 , wherein said predetermined number of high-frequency bands are separated into three frequency bands , and where said high-frequency band voiced/unvoiced discriminating information indicates a voiced state , setting said previously stored predetermined mixing ratio table in the following manner : when the spectral envelope amplitude is maximized in the first or second lowest frequency (signal energy, LP filter, LP filter excitation signal) band , the ratio of pitch pulse (hereinafter , referred to as “voicing strength”) monotonously decreases with increasing frequency of each of said predetermined number of high-frequency bands ;
and when the spectral envelope amplitude is maximized in the highest frequency band , the ratio of pitch pulse for the second lowest frequency band is smaller than the voicing strength for the first lowest frequency band while the voicing strength for the highest frequency band is larger than the ratio of pitch pulse for the second lowest frequency band .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2001166800A

Filed: 1999-12-09     Issued: 2001-06-22

Speech encoding method and speech decoding method (音声符号化方法及び音声復号化方法)

(Original Assignee) Nippon Telegr & Teleph Corp <Ntt>; 日本電信電話株式会社     

Yuusuke Hiwazaki, Kazunori Mano, 祐介 日和崎, 一則 間野
US7693710B2
CLAIM 1
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (周期性) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (の平均) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity (周期性: pitch period) of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average (の平均: first impulse) power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.
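
For comparison, the decoder of JP2001166800A claim 4 (translated above) forms its excitation from a codebook-decoded low band plus high-pass filtered white noise scaled to the quantized per-subframe average power; a minimal sketch, in which the FIR high-pass taps and the gain rule are assumptions, is:

import numpy as np

def jp_claim4_excitation(lowband, avg_subframe_power, hp_taps, subframe_len):
    """Low band from the noise codebook plus a high band made of high-pass
    filtered white noise whose per-subframe power is set from the quantized
    average power; the sum drives the LP synthesis filter."""
    lowband = np.asarray(lowband, dtype=float)
    n = len(lowband)
    high = np.convolve(np.random.randn(n), hp_taps, mode="same")
    for i, p in enumerate(avg_subframe_power):
        s = slice(i * subframe_len, min((i + 1) * subframe_len, n))
        seg = high[s]
        if seg.size == 0:
            break
        rms = np.sqrt(np.mean(seg ** 2)) + 1e-12
        high[s] = seg * (np.sqrt(p) / rms)      # impose the quantized power
    return lowband + high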

US7693710B2
CLAIM 2
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.

US7693710B2
CLAIM 3
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (周期性) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity (周期性: pitch period) of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.
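
The phase-information step of claims 3 and 11 (locating the first glottal pulse as the largest-magnitude sample within a pitch period and quantizing its position) can be sketched as follows; the uniform quantization step is an assumption, not the quantizer of the patent.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=8):
    """Take the sample of maximum absolute amplitude within the first pitch
    period of the LP residual as the first glottal pulse and quantize its
    position with a uniform step."""
    search = np.asarray(residual, dtype=float)[:int(pitch_period)]
    pos = int(np.argmax(np.abs(search)))
    q_pos = int(round(pos / step)) * step       # illustrative uniform quantizer
    return pos, q_pos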

US7693710B2
CLAIM 4
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.

US7693710B2
CLAIM 5
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (フレームごと) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (フレームごと) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.

US7693710B2
CLAIM 8
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

JP2001166800A
CLAIM 4
[Claim 4] A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by driving a linear prediction synthesis filter with an excitation source, wherein the noise code and the average power generated by the speech coding method according to claim 1 are received as input, the low-band waveform is decoded from a codebook on the basis of said noise code, the high-band waveform is synthesized by multiplying white noise passed through a high-pass filter by a per-subframe gain derived from said quantized average power, and the waveforms of these two bands are added together to serve as the excitation source of the linear prediction synthesis filter.

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (フレームごと) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001166800A
CLAIM 1
[Claim 1] A speech coding method in which a speech signal is analyzed by linear prediction frame by frame (フレームごと: frame erasure, concealing frame erasure) to obtain linear prediction analysis coefficients, and a code is determined by quantizing a feature of the residual signal obtained by driving a linear prediction synthesis filter whose filter coefficients are based on said linear prediction analysis coefficients; wherein, as said feature, the periodicity of said residual signal is determined, and when said periodicity is lower than a predetermined threshold, said residual signal is band-split into a low-band component and a high-band component, a noise code corresponding to the noise code vector having the smallest distance to said low-band component is selected, and an average power of said high-band component is calculated for each subframe constituting said frame.

US7693710B2
CLAIM 10
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 11
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (周期性) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性 (pitch period) を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。
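
Illustrative note (editorial): the phase-information determination of claims 10 and 11 above, i.e. taking the sample of maximum amplitude within a pitch period of the LP residual as the first glottal pulse and quantizing its position, can be sketched as follows. The residual input, the 6-bit uniform quantizer and the function name are assumptions for illustration only.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, num_bits=6):
    # Search the first pitch period of the LP residual for the sample of
    # maximum amplitude and quantize its position with a uniform quantizer.
    search = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(search)))               # sample of maximum amplitude
    step = max(pitch_period / float(2 ** num_bits), 1.0)
    q_index = int(round(pos / step))                   # index transmitted to the decoder
    q_pos = min(int(round(q_index * step)), pitch_period - 1)
    return pos, q_index, q_pos                         # raw position, quantizer index, reconstructed position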

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (フレームごと) caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (周期性) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (の平均) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性 (pitch period) を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均 (first impulse) パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。
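
Illustrative note (editorial): the artificial periodic excitation of claim 13 above (a low-pass filtered periodic train of pulses separated by the pitch period) admits a direct realization such as the sketch below, which centers one copy of a low-pass filter impulse response on the quantized first-glottal-pulse position and places further copies every average pitch period. The filter, frame length and the stopping point at the frame end are assumptions; the claim stops at the last subframe affected by the artificial construction.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    # Place copies of the low-pass filter impulse response lp_ir, the first one
    # centered on the quantized first-glottal-pulse position, the rest spaced by
    # the average pitch value, accumulating into an artificial excitation buffer.
    excitation = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = int(first_pulse_pos)
    while pos < frame_len:
        for i, h in enumerate(lp_ir):
            idx = pos - half + i
            if 0 <= idx < frame_len:
                excitation[idx] += h
        pos += int(round(avg_pitch))
    return excitation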

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (周期性) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性 (pitch period) を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。
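
Illustrative note (editorial): the energy information parameter of claim 16 above is computed from a maximum of the signal energy for frames classified as voiced or onset, and from an average energy per sample for other frames. A minimal sketch, assuming a pitch-synchronous window at the end of the frame for the voiced/onset case and a dB-domain result:

import numpy as np

def energy_information(frame, frame_class, pitch_period):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        # maximum of the signal energy, taken here over the last pitch period (assumption)
        seg = x[-int(pitch_period):] if pitch_period else x
        e = float(np.max(seg ** 2))
    else:
        e = float(np.mean(x ** 2))                     # average energy per sample
    return 10.0 * np.log10(max(e, 1e-12))              # energy information in dB (assumption)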

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。
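
Illustrative note (editorial): the decoder-side energy control of claim 17 above scales the synthesized signal so that its energy at the beginning of the first non-erased frame matches the energy at the end of the last erased frame, then converges toward the received energy information by the end of that frame while limiting any increase. A minimal sketch, assuming linear gain interpolation across the frame and an arbitrary gain cap:

import numpy as np

def scale_first_good_frame(synth, e_last_erased_end, e_received, max_gain=2.0):
    x = np.asarray(synth, dtype=float)
    quarter = max(len(x) // 4, 1)
    e_begin = float(np.mean(x[:quarter] ** 2)) + 1e-12
    e_end = float(np.mean(x[-quarter:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_last_erased_end / e_begin), max_gain)  # match the boundary energy
    g1 = min(np.sqrt(e_received / e_end), max_gain)           # converge to the signalled energy
    gains = np.linspace(g0, g1, num=len(x))                   # sample-by-sample interpolation
    return gains * x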

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (フレームごと) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (フレームごと) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。
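
Illustrative note (editorial): claims 18 and 19 above add rules on the scaling gain used in the previous sketch: the gain is capped when the first non-erased frame is classified as onset, and the beginning-of-frame gain is set equal to the end-of-frame gain during a voiced-to-unvoiced transition and during a comfort-noise-to-active-speech transition. The cap value and the helper below are assumptions:

def boundary_gains(g0, g1, last_class, first_class,
                   last_was_comfort_noise, first_is_active_speech,
                   onset_gain_cap=1.5):
    # Select the gains used at the beginning (g0) and end (g1) of the first
    # non-erased frame before they are interpolated across the frame.
    if first_class == "onset":
        g0 = min(g0, onset_gain_cap)                   # limit the gain for onset frames
    voiced_like = ("voiced transition", "voiced", "onset")
    if (last_class in voiced_like and first_class == "unvoiced") or \
       (last_was_comfort_noise and first_is_active_speech):
        g0 = g1                                        # use the end-of-frame gain from the start
    return g0, g1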

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (フレームごと) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001166800A
CLAIM 1
【請求項1】音声信号をフレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (周期性) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性 (pitch period) を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (フレームごと) caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (フレームごと) caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001166800A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) フレームごと (frame erasure, concealing frame erasure) に線形予測分析し て線形予測分析係数を求め、前記線形予測分析係数に基 づくフィルタ係数を用いた線形予測合成フィルタを駆動 して得られた残差信号の特徴量を量子化した符号を決定 する音声符号化方法であって、 前記特徴量として、前記残差信号の周期性を判定し、前 記周期性が予め定められた閾値より低い場合、前記残差 信号を低域成分と高域成分とに帯域分割し、 前記低域成分との距離が最小となる雑音符号ベクトルに 対応する雑音符号を選択し、 前記高域成分は前記フレームを構成するサブフレームご との平均パワーを算出することを特徴とする音声符号化 方法。

JP2001166800A
CLAIM 4
【請求項4】線形予測合成フィルタを励振源で駆動して 音声信号を復号化する音声復号 (sound signal, speech signal) 化方法であって、 請求項1に記載の音声符号化方法により生成された雑音 符号と平均パワーを入力し、 低域成分の波形は、前記雑音符号に基づいて、符号帳か ら復号し、 高域成分の波形は、高域通過フィルタを通した白色雑音 を、量子化された前記平均パワーを元にサブフレーム毎 に利得を乗じて合成し、これらの二つの帯域の波形を足 し合わせて、線形予測合成フィルタの励振源とすること を特徴とする音声復号化方法。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6393390B1

Filed: 1999-12-06     Issued: 2002-05-21

LPAS speech coder using vector quantized, multi-codebook, multi-tap pitch predictor and optimized ternary source excitation codebook derivation

(Original Assignee) DSP Software Engineering Inc     (Current Assignee) Telecom Holding Parent LLC

Jayesh S. Patel, Douglas E. Kolb
US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (loop manner) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6393390B1
CLAIM 16
. The system as claimed in claim 15 wherein the corresponding vector values are derived in an open loop manner (communication link) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2001144733A

Filed: 1999-11-15     Issued: 2001-05-25

音声伝送装置及び音声伝送方法 (Voice transmission apparatus and voice transmission method)

(Original Assignee) Nec Corp; Nec Viewtechnology Ltd; エヌイーシービューテクノロジー株式会社; 日本電気株式会社     

Masayuki Kitagawa, 真幸 北川
US7693710B2
CLAIM 1
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (PSK) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 4
【請求項4】 前記変調信号が、QPSK (onset frame) 変調方式によ って生成されることを特徴とする請求項3記載の音声伝 送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 2
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 3
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 2
【請求項2】 前記音声データがステレオ音声データか らなり、前記多重化音声情報が、4チャネルの音声デー タと、フレーム番号情報と、誤り訂正用データと、制御 用データとを多重化したフォーマットを有すること (maximum amplitude) を特 徴とする請求項1記載の音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 4
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声情報, 前記送信) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 5
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (音声情報, 前記送信) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (フレームごと) is classified as onset , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (音声情報, 前記送信) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (フレームごと) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラーを検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。
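
Editor's illustration (not part of the chart): the claim 7 condition — using the end-of-frame gain already at the beginning of the first good frame in a voiced-to-unvoiced transition, or when active speech resumes after comfort noise — reduces to a simple test. The class labels and the None return value below are illustrative assumptions, not claim language.

```python
def start_gain_after_erasure(prev_class, cur_class, prev_is_cng, cur_is_active, g_end):
    """Hypothetical reading of claim 7: return the end-of-frame gain to be used
    already at the start of the first good frame when (a) the last good frame
    before the erasure was voiced transition/voiced/onset and the first good
    frame after it is unvoiced, or (b) comfort noise is followed by active
    speech; otherwise signal that the normal gain ramp applies."""
    voiced_to_unvoiced = (prev_class in {"VOICED_TRANSITION", "VOICED", "ONSET"}
                          and cur_class == "UNVOICED")
    cng_to_active = prev_is_cng and cur_is_active
    return g_end if (voiced_to_unvoiced or cng_to_active) else None
```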

US7693710B2
CLAIM 8
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (フレームごと) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。
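
Editor's illustration (not part of the chart): the relation reconstructed above in claim 9, E_q = E_1 · (E_LP0 / E_LP1), only needs the energies of the LP-filter impulse responses of the last good frame before and the first good frame after the erasure. A minimal NumPy sketch, assuming a 64-sample impulse-response length and LP coefficients given as [1, a1, …, ap]:

```python
import numpy as np

def lp_impulse_energy(a, n=64):
    """Energy of the impulse response of the all-pole LP synthesis filter 1/A(z),
    with A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p; n samples is an assumed length."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * h[i - k]   # recursion y[n] = x[n] - sum a[k] y[n-k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1), with E_LP0 taken from the last non erased
    frame before the erasure and E_LP1 from the first non erased frame after it."""
    return e1 * lp_impulse_energy(a_last_good) / lp_impulse_energy(a_first_good)
```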

US7693710B2
CLAIM 10
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 11
. A method of concealing frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001144733A
CLAIM 2
【請求項2】 前記音声データがステレオ音声データか らなり、前記多重化音声情報が、4チャネルの音声デー タと、フレーム番号情報と、誤り訂正用データと、制御 用データとを多重化したフォーマットを有すること (maximum amplitude) を特 徴とする請求項1記載の音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。
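
Editor's illustration (not part of the chart): the claim 11 determination of the phase information parameter amounts to picking the maximum-amplitude sample inside a pitch period and quantizing its position. In the sketch below, the uniform 4-sample quantization step and the use of the LP residual as input are assumptions for illustration only.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Hypothetical sketch of claim 11: the sample of maximum amplitude within a
    pitch period is taken as the first glottal pulse and its position is
    quantized (here: uniformly, with an assumed step of 4 samples)."""
    segment = residual[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))     # sample of maximum amplitude
    index = pos // step                       # quantization index of the position
    return pos, index, index * step           # raw position, index, decoded position
```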

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (フレームごと) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 13
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (PSK) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 4
【請求項4】 前記変調信号が、QPSK (onset frame) 変調方式によ って生成されることを特徴とする請求項3記載の音声伝 送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。
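
Editor's illustration (not part of the chart): the artificial periodic excitation of claim 13 — a low-pass filtered train of pulses anchored on the quantized first-glottal-pulse position and spaced one average pitch period apart — can be sketched directly from the claim wording. The frame-length handling and the externally supplied low-pass impulse response are assumptions.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_impulse):
    """Hypothetical sketch of claim 13: centre a first low-pass-filter impulse
    response on the quantized first-glottal-pulse position, then place the
    remaining impulse responses one (rounded) average pitch period apart up to
    the end of the region being reconstructed."""
    exc = np.zeros(frame_len)
    half = len(lp_impulse) // 2
    step = max(1, int(round(avg_pitch)))      # guard against a non-positive pitch
    pos = int(first_pulse_pos)
    while pos < frame_len:
        for k, h in enumerate(lp_impulse):    # impulse response centred on `pos`
            idx = pos - half + k
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += step                           # next pulse one average pitch later
    return exc
```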

US7693710B2
CLAIM 14
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 15
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 2
【請求項2】 前記音声データがステレオ音声データか らなり、前記多重化音声情報が、4チャネルの音声デー タと、フレーム番号情報と、誤り訂正用データと、制御 用データとを多重化したフォーマットを有すること (maximum amplitude) を特 徴とする請求項1記載の音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 16
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声情報, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。
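
Editor's illustration (not part of the chart): the "computer of the energy information parameter" of claim 16 switches between two simple measures depending on the frame class. The sketch assumes the raw maximum of the squared signal and the plain per-sample mean; any windowing or dB conversion used in practice is outside the claim wording.

```python
import numpy as np

def energy_information_parameter(frame, classification):
    """Hypothetical sketch of claim 16: maximum of the signal energy for frames
    classified as voiced or onset, average energy per sample for other frames."""
    if classification in {"VOICED", "ONSET"}:
        return float(np.max(frame ** 2))      # maximum of the signal energy
    return float(np.mean(frame ** 2))         # average energy per sample
```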

US7693710B2
CLAIM 17
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (音声情報, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (フレームごと) is classified as onset , the decoder , for conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (音声情報, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (フレームごと) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラーを検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 20
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (フレーム番号, エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (フレームごと) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 22
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 23
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 2
【請求項2】 前記音声データがステレオ音声データか らなり、前記多重化音声情報が、4チャネルの音声デー タと、フレーム番号情報と、誤り訂正用データと、制御 用データとを多重化したフォーマットを有すること (maximum amplitude) を特 徴とする請求項1記載の音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 24
. A device for conducting concealment (フレーム番号, エラー) of frame erasure (フレームごと) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声情報, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報 (speech signal) を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 7
【請求項7】 送信側において、2チャネルの同一の音 声データにおける一方のチャネルの音声データを遅延す るとともに他方のチャネルの音声データと多重して伝送 し、受信側において、受信した前記一方のチャネルの音 声データにおけるエラーの有無に応じて、前記他方のチ ャネルの音声データを前記送信 (speech signal) 側における一方のチャネ ルの音声データの遅延時間と等しい時間遅延したデータ 又は前記遅延した一方のチャネルの音声データを出力す ることを特徴とする音声伝送方法。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (フレームごと) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (フレーム番号, エラー) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (フレーム番号, エラー) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001144733A
CLAIM 1
【請求項1】 2チャネルのフレーム化された同一の音 声データに対するフレーム番号 (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を発生するフレーム番号 発生手段と、一方のチャネルの音声データを蓄積する遅 延用記憶手段と、該蓄積された一方のチャネルの音声デ ータと、他方のチャネルの音声データと、フレーム番号 情報とを多重して多重化音声情報を生成する多重手段と を送信部に備え、 前記多重化音声情報における前記一方のチャネルの音声 データにおけるエラー (frame erasure concealment, frame concealment, decoder conducts frame erasure concealment, decoder concealment, conducting concealment) を検出する誤り検出手段と、該エ ラー検出時の前記一方のチャネルの音声データのフレー ム番号を検出するフレームチェック手段と、 前記多重化音声情報を分離する分離手段と、分離された 前記他方のチャネルの音声データを蓄積する記憶手段 と、前記エラーが検出されないとき分離された前記一方 のチャネルの音声データを選択し、エラーが検出された とき前記記憶手段からフレームチェック手段で検出され たフレーム番号に対応する他方のチャネルの音声データ を選択して出力するスイッチ手段とを受信部に備えて い ることを特徴とする音声伝送装置。

JP2001144733A
CLAIM 8
【請求項8】 送信側において、送信入力における一方 のチャネルの音声データと他方のチャネルの音声データ とにフレームごと (frame erasure, concealing frame erasure) に同じ番号を付与し、前記一方のチャ ネルの音声データを遅延用記憶手段に蓄積して出力する ことによって遅延させるとともに、受信側において、前 記他方のチャネルの音声データを記憶手段に蓄積し、前 記一方のチャネルの音声データにおいてエラーを検出し たとき、蓄積されている前記他方のチャネルの音声デー タにおける前記エラー検出時のフレーム番号の音声デー タを出力することによって、前記送信側の遅延時間と受 信側の遅延時間とを等しくすることを特徴とする請求項 7記載の音声伝送方法。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2001117573A

Filed: 1999-10-20     Issued: 2001-04-27

音声スペクトル強調方法/装置及び音声復号化装置 (Speech spectrum enhancement method/apparatus and speech decoding apparatus)

(Original Assignee) Toshiba Corp; 株式会社東芝     

Kimio Miseki, Masahiro Oshikiri, 公生 三関, 正浩 押切
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。
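
Editor's illustration (not part of the chart): JP2001117573A claims 1 and 9 describe a speech-spectrum emphasis stage that boosts bands around the peaks of the amplitude-spectrum envelope and attenuates bands around its valleys in the decoded signal. The frequency-domain gain mask below is only a rough analogy under assumed numeric values; the reference itself constructs a filter from the detected peak/valley frequencies rather than applying FFT-bin gains.

```python
import numpy as np

def rough_spectral_emphasis(frame, n_fft=256, boost=1.2, cut=0.8):
    """Very rough, assumption-laden analogy of the JP2001117573A idea: emphasise
    spectral-envelope peak bands and attenuate valley bands with zero-phase
    gains (assumes len(frame) <= n_fft; all constants are illustrative)."""
    spec = np.fft.rfft(frame, n_fft)
    env = np.abs(spec)
    smooth = np.convolve(env, np.ones(9) / 9.0, mode="same")  # crude envelope estimate
    gains = np.where(env >= smooth, boost, cut)               # peaks up, valleys down
    return np.fft.irfft(spec * gains, n_fft)[:len(frame)]
```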

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 2
【請求項2】音声信号の振幅スペクトル概形を求める手 段と、 前記振幅スペクトル概形の凸部周波数及び凹部周波数を 求める手段と、 前記凸部周波数及び凹部周波数から凸部周波数及び凹部 周波数をそれぞれ含む凸部帯域及び凹部帯域を決定する 手段と、 前記凸部帯域に含まれる周波数成分の振幅スペクトルを 強調し、前記凹部帯域に含まれる周波数成分の振幅スペ クトルを減衰させる特性を有するフィルタを構成して、 該フィルタにより前記音声信号をフィルタリングする手 段とを有すること (maximum amplitude) を特徴とする音声スペクトル強調装 置。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001117573A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の振幅スペクトル概形の凸部周波 数及び凹部周波数をそれぞれ含む凸部帯域及び凹部帯域 を決定し、凸部帯域に含まれる周波数成分の振幅スペク トルを強調し、凹部帯域に含まれる周波数成分の振幅ス ペクトルを減衰させる特性を有するフィルタを構成し て、該フィルタにより音声信号をフィルタリングするこ とを特徴とする音声スペクトル強調方法。

JP2001117573A
CLAIM 9
【請求項9】音声信号の符号化データを復号して復号音 声信号及び少なくとも音声信号の振幅スペクトルの情報 を含むパラメータを出力する音声復号 (sound signal, speech signal) 部と、 前記音声復号部からの復号音声信号及び前記パラメータ を入力する請求項2乃至8のいずれか1項記載の音声ス ペクトル強調装置により構成されるスペクトル強調部と を有し、 前記スペクトル強調部は、前記パラメータから前記スペ クトル概形を求め、該復号音声信号について前記フィル タリング処理を行うことを特徴とする音声復号化装置。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
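For orientation on the quoted phase-information limitation, the following is a minimal Python sketch of locating and quantizing a first glottal pulse position; the residual array, pitch period and four-sample quantization step are illustrative assumptions, not values taken from US7693710B2 or the cited art.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    # Sample of maximum amplitude within the first pitch period is taken
    # as the first glottal pulse; its position is then uniformly quantized.
    segment = residual[:pitch_period]
    position = int(np.argmax(np.abs(segment)))
    quantized = int(round(position / step)) * step
    return position, quantized

# Synthetic residual with a dominant pulse at sample 37 (hypothetical data).
residual = np.zeros(160)
residual[37] = 1.0
print(first_glottal_pulse_position(residual, pitch_period=80))   # -> (37, 36)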
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 2
[Claim 2] A speech spectrum enhancement device characterized by comprising [有すること] (maximum amplitude): means for obtaining an amplitude spectral envelope of a speech signal; means for obtaining a peak frequency and a valley frequency of the amplitude spectral envelope; means for determining, from the peak frequency and the valley frequency, a peak band and a valley band respectively containing the peak frequency and the valley frequency; and means for constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band, and for filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
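As a worked illustration of the relation E_q = E_1 × (E_LP0 / E_LP1) recited above, the Python sketch below computes LP-filter impulse-response energies and the adjusted excitation energy; the filter coefficients, impulse-response length and E_1 value are assumptions for illustration only.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a_coeffs, length=64):
    # Energy of the impulse response of the LP synthesis filter 1/A(z).
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

# Assumed LP coefficients: a0 for the last good frame before the erasure,
# a1 for the first good frame received after it (illustration only).
a0 = [1.0, -0.9]
a1 = [1.0, -0.5]
E1 = 2.0                                                   # assumed energy at the end of the current frame
Eq = E1 * lp_impulse_energy(a0) / lp_impulse_energy(a1)    # E_q = E_1 * E_LP0 / E_LP1
print(Eq)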
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
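A minimal sketch, in Python, of the kind of artificially constructed periodic excitation described above: a train of low-pass impulse responses, the first centered on the quantized glottal-pulse position and the rest spaced by an average pitch value. The impulse response, frame length and pitch value are assumed for illustration and are not taken from the patent.

import numpy as np

def artificial_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    # Place one low-pass impulse response per pulse: the first is centered on the
    # quantized first-glottal-pulse position, the rest follow every avg_pitch samples.
    excitation = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        for k, h in enumerate(lp_ir):
            idx = pos - half + k
            if 0 <= idx < frame_len:
                excitation[idx] += h
        pos += avg_pitch
    return excitation

lp_ir = np.array([0.25, 0.5, 1.0, 0.5, 0.25])      # assumed low-pass impulse response
exc = artificial_periodic_excitation(frame_len=256, first_pulse_pos=10,
                                     avg_pitch=60, lp_ir=lp_ir)
print(np.nonzero(exc)[0][:10])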
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 2
[Claim 2] A speech spectrum enhancement device characterized by comprising [有すること] (maximum amplitude): means for obtaining an amplitude spectral envelope of a speech signal; means for obtaining a peak frequency and a valley frequency of the amplitude spectral envelope; means for determining, from the peak frequency and the valley frequency, a peak band and a valley band respectively containing the peak frequency and the valley frequency; and means for constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band, and for filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
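One possible reading of this two-branch energy computation is sketched below in Python; the frame array, classification labels and the use of the maximum squared sample (rather than a pitch-synchronous maximum, which would also need the pitch lag) are simplifying assumptions.

import numpy as np

def energy_information(frame, frame_class):
    # Maximum of the signal energy for voiced/onset frames,
    # average energy per sample for the other classes.
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))

frame = np.sin(2 * np.pi * 100 * np.arange(160) / 8000.0)     # hypothetical 20 ms frame at 8 kHz
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))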
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
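The gain behaviour described above can be pictured as a per-sample interpolation between a beginning-of-frame gain and an end-of-frame gain, with any increase capped; the sketch below is a simplified Python illustration in which the energy-measurement windows, the linear interpolation and the cap value are assumptions, not the patent's actual procedure.

import numpy as np

def scale_first_good_frame(synth, e_begin_target, e_end_target, max_gain=2.0):
    # g0 matches the frame start to the energy at the end of the last erased frame;
    # g1 converges toward the received energy information; both gains are capped
    # so that the energy increase stays limited (cap value is an assumption).
    n = len(synth)
    e0 = np.mean(synth[: n // 4] ** 2) + 1e-12
    e1 = np.mean(synth[-(n // 4):] ** 2) + 1e-12
    g0 = min(np.sqrt(e_begin_target / e0), max_gain)
    g1 = min(np.sqrt(e_end_target / e1), max_gain)
    gains = np.linspace(g0, g1, n)
    return synth * gains

frame = np.random.default_rng(0).standard_normal(256) * 0.1
print(scale_first_good_frame(frame, e_begin_target=0.005, e_end_target=0.02)[:5])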
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
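The two transition cases in which the beginning-of-frame scaling gain is made equal to the end-of-frame gain can be captured by a small predicate; in the Python sketch below the class and coding labels simply mirror the claim wording and are otherwise assumptions.

def use_end_gain_at_beginning(last_good_class, first_good_class,
                              last_good_coding, first_good_coding):
    # Case 1: voiced-to-unvoiced transition around the erasure.
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    # Case 2: comfort noise before the erasure, active speech after it.
    inactive_to_active = (last_good_coding == "comfort noise"
                          and first_good_coding == "active speech")
    return voiced_to_unvoiced or inactive_to_active

print(use_end_gain_at_beginning("voiced", "unvoiced", "active speech", "active speech"))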
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 2
[Claim 2] A speech spectrum enhancement device characterized by comprising [有すること] (maximum amplitude): means for obtaining an amplitude spectral envelope of a speech signal; means for obtaining a peak frequency and a valley frequency of the amplitude spectral envelope; means for determining, from the peak frequency and the valley frequency, a peak band and a valley band respectively containing the peak frequency and the valley frequency; and means for constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band, and for filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JP2001117573A
CLAIM 1
[Claim 1] A speech spectrum enhancement method characterized by: determining a peak band and a valley band respectively containing a peak frequency and a valley frequency of an amplitude spectral envelope of a speech signal [音声信号] (sound signal, speech signal); constructing a filter having a characteristic that emphasizes the amplitude spectrum of frequency components contained in the peak band and attenuates the amplitude spectrum of frequency components contained in the valley band; and filtering the speech signal with said filter.

JP2001117573A
CLAIM 9
[Claim 9] A speech decoding device characterized by comprising: a speech decoding [音声復号] (sound signal, speech signal) section that decodes encoded data of a speech signal and outputs a decoded speech signal and parameters including at least information on an amplitude spectrum of the speech signal; and a spectrum enhancement section constituted by the speech spectrum enhancement device according to any one of claims 2 to 8, which receives the decoded speech signal and the parameters from the speech decoding section; wherein the spectrum enhancement section obtains said spectral envelope from the parameters and performs said filtering processing on the decoded speech signal.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2001051698A

Filed: 1999-08-06     Issued: 2001-02-23

Speech encoding and decoding method and apparatus (音声符号化復号方法および装置)

(Original Assignee) Yrp Kokino Idotai Tsushin Kenkyusho:Kk; 株式会社ワイ・アール・ピー高機能移動体通信研究所     

Seiji Sasaki, 誠司 佐々木
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (生成器, 音発生) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声復号方法, 音声信号, 音声情報) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (拡散処理) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声復号方法, 音声信号, 音声情報) is a speech signal (音声復号方法, 音声信号, 音声情報) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (拡散処理) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (符号化器) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声復号方法, 音声信号, 音声情報) is a speech signal (音声復号方法, 音声信号, 音声情報) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (拡散処理) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器], the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (ローパス, バンド) of a first non (拡散処理) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器], the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器], the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声復号方法, 音声信号, 音声情報) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (符号化器) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (ローパス, バンド) of a first non (拡散処理) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (符号化器) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (符号化器) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (生成器, 音発生) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (符号化器) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (符号化器) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (符号化器) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声復号方法, 音声信号, 音声情報) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声復号方法, 音声信号, 音声情報) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (符号化器) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (拡散処理) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声復号方法, 音声信号, 音声情報) is a speech signal (音声復号方法, 音声信号, 音声情報) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (拡散処理) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (符号化器) , limits to a given value a gain used for scaling the synthesized sound signal .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method [音声復号方法] (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output obtained by encoding a speech signal [音声信号] (speech signal, sound signal) with a linear-prediction analysis-by-synthesis speech encoder [音声符号化器] (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information [音声情報] (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the value of the spectral envelope amplitude is largest; determining, on the basis of the determined band and the voiced/unvoiced discrimination information, a mixing ratio with which a pitch pulse generated for each band on the basis of the pitch period information is mixed with white noise; creating a mixed signal for each band on the basis of the determined mixing ratio and then adding the mixed signals of all the bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声復号方法, 音声信号, 音声情報) is a speech signal (音声復号方法, 音声信号, 音声情報) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (spreading processing) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) with a linear predictive analysis-and-synthesis speech encoder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the spectral envelope amplitude is largest; determining, based on the determined band and the voiced/unvoiced discrimination information, a mixing ratio for mixing, in each band, white noise with pitch pulses generated from the pitch period information; creating a mixed signal for each band according to the determined mixing ratio and then summing the mixed signals of all bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech decoding method, speech signal, speech information) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (encoder) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (low-pass, band) of a first non (spreading processing) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) with a linear predictive analysis-and-synthesis speech encoder (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the spectral envelope amplitude is largest; determining, based on the determined band and the voiced/unvoiced discrimination information, a mixing ratio for mixing, in each band, white noise with pitch pulses generated from the pitch period information; creating a mixed signal for each band according to the determined mixing ratio and then summing the mixed signals of all bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech decoding method, speech signal, speech information) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) with a linear predictive analysis-and-synthesis speech encoder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the spectral envelope amplitude is largest; determining, based on the determined band and the voiced/unvoiced discrimination information, a mixing ratio for mixing, in each band, white noise with pitch pulses generated from the pitch period information; creating a mixed signal for each band according to the determined mixing ratio and then summing the mixed signals of all bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech decoding method, speech signal, speech information) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) with a linear predictive analysis-and-synthesis speech encoder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the spectral envelope amplitude is largest; determining, based on the determined band and the voiced/unvoiced discrimination information, a mixing ratio for mixing, in each band, white noise with pitch pulses generated from the pitch period information; creating a mixed signal for each band according to the determined mixing ratio and then summing the mixed signals of all bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech decoding method, speech signal, speech information) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) with a linear predictive analysis-and-synthesis speech encoder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the spectral envelope amplitude is largest; determining, based on the determined band and the voiced/unvoiced discrimination information, a mixing ratio for mixing, in each band, white noise with pitch pulses generated from the pitch period information; creating a mixed signal for each band according to the determined mixing ratio and then summing the mixed signals of all bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (speech decoding method, speech signal, speech information) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (encoder) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (low-pass, band) of a first non (spreading processing) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2001051698A
CLAIM 1
[Claim 1] A speech decoding method (speech signal, sound signal) for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) with a linear predictive analysis-and-synthesis speech encoder (decoder concealment, decoder recovery, decoder constructs, decoder determines concealment), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; obtaining a spectral envelope amplitude from the spectral envelope information and determining, among bands divided on the frequency axis, the band in which the spectral envelope amplitude is largest; determining, based on the determined band and the voiced/unvoiced discrimination information, a mixing ratio for mixing, in each band, white noise with pitch pulses generated from the pitch period information; creating a mixed signal for each band according to the determined mixing ratio and then summing the mixed signals of all bands to create a mixed excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the mixed excitation signal.
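The relation recited in claims 9, 12, 21 and 25, reconstructed above as E_q = E_1 (E_LP0 / E_LP1), scales the excitation energy by the ratio of LP-filter impulse-response energies. The short Python sketch below works that arithmetic through; the 64-sample impulse-response truncation and the one-pole example filters are illustrative assumptions, not the codec's actual filters.

    import numpy as np

    def lp_impulse_response_energy(a, length=64):
        # Energy of the impulse response of an all-pole LP synthesis filter 1/A(z),
        # where a = [1, a1, ..., ap] are the LP coefficients, truncated to `length`
        # samples (an illustrative choice).
        h = np.zeros(length)
        for n in range(length):
            x = 1.0 if n == 0 else 0.0  # unit impulse input
            h[n] = x - sum(a[k] * h[n - k] for k in range(1, len(a)) if n - k >= 0)
        return float(np.sum(h ** 2))

    def adjusted_energy(E1, a_last_good, a_first_good):
        # E_q = E_1 * (E_LP0 / E_LP1): scale the excitation energy E_1 at the end of
        # the current frame by the ratio of LP impulse-response energies of the last
        # good frame before the erasure (E_LP0) and the first good frame after it (E_LP1).
        E_LP0 = lp_impulse_response_energy(a_last_good)
        E_LP1 = lp_impulse_response_energy(a_first_good)
        return E1 * E_LP0 / E_LP1

    # Example: the first good frame has a higher LP filter gain, so the excitation
    # energy is scaled down to compensate.
    print(adjusted_energy(E1=0.02, a_last_good=[1.0, -0.5], a_first_good=[1.0, -0.8]))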




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2001013998A

Filed: 1999-06-30     Issued: 2001-01-19

Speech decoding apparatus and code error compensation method

(Original Assignee) Matsushita Electric Ind Co Ltd; NEC Corp

Hiroyuki Ebara, Kazunori Ozawa, Masahiro Serizawa, Koji Yoshida
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech decoding, recording medium) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.
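The energy information parameter charted in claims 4, 16 and 24 can be illustrated with the short sketch below: it returns a maximum-energy figure for frames classified as voiced or onset and the average energy per sample for all other frames. The class labels, the frame length and the use of the largest squared sample as the "maximum of the signal energy" are assumptions made only for illustration.

    import numpy as np

    def energy_information_parameter(frame, frame_class):
        # Energy information parameter as charted: maximum of the signal energy for
        # voiced/onset frames, average energy per sample for all other classes.
        # frame_class is one of: 'unvoiced', 'unvoiced_transition',
        # 'voiced_transition', 'voiced', 'onset' (illustrative labels).
        x = np.asarray(frame, dtype=float)
        if frame_class in ('voiced', 'onset'):
            return float(np.max(x ** 2))   # here: largest squared sample value
        return float(np.mean(x ** 2))      # other frames: average energy per sample

    # Example: the same frame yields different parameters depending on its class.
    frame = np.sin(2 * np.pi * 100 * np.arange(160) / 8000)
    print(energy_information_parameter(frame, 'voiced'))    # ~1.0 (peak energy)
    print(energy_information_parameter(frame, 'unvoiced'))  # ~0.5 (mean energy)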

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech decoding, recording medium) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech decoding, recording medium) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.
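The gain rules of claims 6 and 7 (and their device counterparts 18 and 19) can be summarized in the sketch below: (i) the scaling gain is capped when the first good frame is an onset, and (ii) the end-of-frame gain is reused from the start of the frame for the voiced-to-unvoiced and comfort-noise-to-active-speech transitions described in the claims. The 1.2 onset cap and the class labels are assumptions for illustration only.

    def scaling_gains(g_begin, g_end, first_good_class, last_good_class,
                      last_good_was_comfort_noise=False, onset_gain_cap=1.2):
        # Return (g_begin, g_end) used to scale the synthesized signal across the
        # first good frame after an erasure, following the charted gain rules.
        # Rule 1: if the first good frame is an onset, limit the gain to a given value.
        if first_good_class == 'onset':
            g_begin = min(g_begin, onset_gain_cap)
            g_end = min(g_end, onset_gain_cap)
        # Rule 2: use the end-of-frame gain from the start of the frame during a
        # voiced-to-unvoiced transition ...
        if (first_good_class == 'unvoiced'
                and last_good_class in ('voiced_transition', 'voiced', 'onset')):
            g_begin = g_end
        # ... and during a comfort-noise-to-active-speech transition.
        if last_good_was_comfort_noise and first_good_class != 'comfort_noise':
            g_begin = g_end
        return g_begin, g_end

    # Example: voiced frame lost before an unvoiced frame -> no energy ramp at frame start.
    print(scaling_gains(1.5, 0.7, first_good_class='unvoiced', last_good_class='voiced'))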

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech decoding, recording medium) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech decoding, recording medium) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech decoding, recording medium) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech decoding, recording medium) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2001013998A
CLAIM 1
[Claim 1] A speech decoding (speech signal) apparatus comprising: receiving means for receiving data having encoded transmission parameters including mode information, a lag parameter and a gain parameter; decoding means for decoding the mode information, the lag parameter and the gain parameter; and determining means for adaptively determining, for a decoding unit in which an error is detected in the data, the lag parameter and the gain parameter to be used for that decoding unit, using mode information of a decoding unit earlier than that decoding unit.

JP2001013998A
CLAIM 23
[Claim 23] A computer-readable recording (speech signal) medium storing a program, the program comprising: a procedure for decoding mode information, a lag parameter and a gain parameter in data having encoded transmission parameters including the mode information, the lag parameter and the gain parameter; and a procedure for controlling, for a decoding unit in which an error is detected in the data, the gain ratio between an adaptive excitation gain and a fixed excitation gain using mode information of a decoding unit earlier than that decoding unit, such that the proportion of the adaptive excitation gain is raised when the mode indicated by the mode information is a voiced mode and lowered when the mode indicated by the mode information is a transient mode or an unvoiced mode.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
WO9966494A1

Filed: 1999-06-16     Issued: 1999-12-23

Improved lost frame recovery techniques for parametric, lpc-based speech coding systems

(Original Assignee) Comsat Corporation     

Grant Ian Ho, Marion Baraniecki, Suat Yeldener
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (successive frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame in a system of the type wherein information is transmitted as successive frames (onset frame) of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames to obtain recovered encoded signals for said lost frame .
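The periodic-part construction of claims 1 and 13 (a low-pass filtered train of pulses, with the first filter impulse response centred on the quantized first-glottal-pulse position and the remaining responses spaced by the average pitch) can be sketched roughly as follows. The 17-tap windowed-sinc filter, frame length and pitch values are illustrative assumptions, not the codec's actual design.

    import numpy as np

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps):
        # Artificial periodic excitation for a lost onset frame: place one low-pass
        # filter impulse response at the quantized first glottal pulse position,
        # then repeat it every avg_pitch samples up to the end of the frame.
        exc = np.zeros(frame_len)
        half = len(lp_taps) // 2
        pos = first_pulse_pos
        while pos < frame_len:
            for i, tap in enumerate(lp_taps):
                idx = pos + i - half        # centre the impulse response on pos
                if 0 <= idx < frame_len:
                    exc[idx] += tap
            pos += avg_pitch                # next pulse one average pitch later
        return exc

    # Illustrative low-pass impulse response: a truncated, Hamming-windowed sinc.
    n = np.arange(-8, 9)
    lp_taps = np.sinc(n * 0.5) * np.hamming(len(n))
    exc = build_periodic_excitation(frame_len=256, first_pulse_pos=30,
                                    avg_pitch=60, lp_taps=lp_taps)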

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (signal energy) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
WO9966494A1
CLAIM 4
. A method according to claim 1 , wherein each frame includes a plurality of subframes , said method comprising the step of comparing a signal energy (signal energy) for each subframe of a particular frame against a threshold , and attenuating signal energies for all subframes in said particular frame if the signal energy in any subframe exceeds said threshold .

WO9966494A1
CLAIM 6
. A method according to claim 2 , wherein said encoded signals include said LSP parameters , fixed codebook (average energy) gains and further excitation signals , said method comprising interpolating said fixed codebook gain of said lost frame from the fixed codebook gains of said first and second frames , and adopting said further excitation signals from said first frame as the further excitation signals of said lost frame .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
WO9966494A1
CLAIM 2
. A method according to claim 1 , wherein said encoded signals include a plurality of Line Spectral Pair (LSP) parameters corresponding to each frame , and said interpolating step comprises i (LP filter) nterpolating between the LSP parameters of said first frame and the LSP parameters of said second frame .
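The WO9966494A1 technique mapped against this element recovers a lost frame by interpolating the stored parameters of the surrounding good frames rather than by using transmitted concealment parameters. A minimal sketch, assuming equal-weight linear interpolation of LSP vectors and fixed-codebook gains (the 0.5 weighting and the 10th-order vectors are illustrative), follows.

    import numpy as np

    def recover_lost_frame_params(lsp_prev, lsp_next, fcb_gain_prev, fcb_gain_next):
        # Recover parameters of a lost frame by interpolating between the stored
        # parameters of the frame before (prev) and after (next) the loss, in the
        # spirit of WO9966494A1 claims 1, 2 and 6.  Illustrative sketch only.
        lsp_prev = np.asarray(lsp_prev, dtype=float)
        lsp_next = np.asarray(lsp_next, dtype=float)
        lsp_lost = 0.5 * (lsp_prev + lsp_next)              # interpolate LSP parameters
        fcb_gain_lost = 0.5 * (fcb_gain_prev + fcb_gain_next)  # interpolate fixed codebook gain
        # Other excitation signals would simply be taken over from the previous frame.
        return lsp_lost, fcb_gain_lost

    # Example with 10th-order LSP vectors (values in radians, ascending).
    lsp_a = np.linspace(0.2, 2.8, 10)
    lsp_b = lsp_a + 0.05
    print(recover_lost_frame_params(lsp_a, lsp_b, 0.4, 0.6))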

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9966494A1
CLAIM 2
. A method according to claim 1 , wherein said encoded signals include a plurality of Line Spectral Pair (LSP) parameters corresponding to each frame , and said interpolating step comprises i (LP filter) nterpolating between the LSP parameters of said first frame and the LSP parameters of said second frame .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9966494A1
CLAIM 2
. A method according to claim 1 , wherein said encoded signals include a plurality of Line Spectral Pair (LSP) parameters corresponding to each frame , and said interpolating step comprises i (LP filter) nterpolating between the LSP parameters of said first frame and the LSP parameters of said second frame .

US7693710B2
CLAIM 13
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (successive frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames (onset frame) of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

US7693710B2
CLAIM 14
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

US7693710B2
CLAIM 15
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .
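For the phase-information elements of claims 14, 15, 22 and 23, the sketch below locates the first glottal pulse as the sample of maximum amplitude within one pitch period of the LP residual, records its sign and amplitude, and quantizes its position. The uniform 6-bit position quantizer and the synthetic residual are assumptions made only for illustration.

    import numpy as np

    def first_glottal_pulse_info(residual, pitch_period, position_bits=6):
        # Locate the first glottal pulse as the sample of maximum amplitude within
        # the first pitch period of the LP residual, then quantize its position
        # uniformly.  Returns (quantized_position, sign, amplitude).
        r = np.asarray(residual, dtype=float)
        segment = r[:pitch_period]
        pos = int(np.argmax(np.abs(segment)))      # sample of maximum amplitude
        sign = 1 if segment[pos] >= 0 else -1
        amplitude = float(abs(segment[pos]))
        levels = 2 ** position_bits                # uniform quantization of the position
        step = pitch_period / levels
        q_pos = int(round(pos / step) * step)
        return q_pos, sign, amplitude

    # Example on a synthetic residual with a dominant negative pulse at sample 23.
    res = 0.01 * np.random.randn(160)
    res[23] = -0.9
    print(first_glottal_pulse_info(res, pitch_period=60))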

US7693710B2
CLAIM 16
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (signal energy) for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

WO9966494A1
CLAIM 4
. A method according to claim 1 , wherein each frame includes a plurality of subframes , said method comprising the step of comparing a signal energy (signal energy) for each subframe of a particular frame against a threshold , and attenuating signal energies for all subframes in said particular frame if the signal energy in any subframe exceeds said threshold .

WO9966494A1
CLAIM 6
. A method according to claim 2 , wherein said encoded signals include said LSP parameters , fixed codebook (average energy) gains and further excitation signals , said method comprising interpolating said fixed codebook gain of said lost frame from the fixed codebook gains of said first and second frames , and adopting said further excitation signals from said first frame as the further excitation signals of said lost frame .

US7693710B2
CLAIM 17
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

US7693710B2
CLAIM 20
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

WO9966494A1
CLAIM 2
. A method according to claim 1 , wherein said encoded signals include a plurality of Line Spectral Pair (LSP) parameters corresponding to each frame , and said interpolating step comprises i (LP filter) nterpolating between the LSP parameters of said first frame and the LSP parameters of said second frame .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9966494A1
CLAIM 2
. A method according to claim 1 , wherein said encoded signals include a plurality of Line Spectral Pair (LSP) parameters corresponding to each frame , and said interpolating step comprises i (LP filter) nterpolating between the LSP parameters of said first frame and the LSP parameters of said second frame .

US7693710B2
CLAIM 22
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

US7693710B2
CLAIM 23
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

US7693710B2
CLAIM 24
. A device for conducting concealment (second frames, lost frame) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (signal energy) for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

WO9966494A1
CLAIM 4
. A method according to claim 1 , wherein each frame includes a plurality of subframes , said method comprising the step of comparing a signal energy (signal energy) for each subframe of a particular frame against a threshold , and attenuating signal energies for all subframes in said particular frame if the signal energy in any subframe exceeds said threshold .

WO9966494A1
CLAIM 6
. A method according to claim 2 , wherein said encoded signals include said LSP parameters , fixed codebook (average energy) gains and further excitation signals , said method comprising interpolating said fixed codebook gain of said lost frame from the fixed codebook gains of said first and second frames , and adopting said further excitation signals from said first frame as the further excitation signals of said lost frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (second frames, lost frame) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9966494A1
CLAIM 1
. A method of recovering a lost frame (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) in a system of the type wherein information is transmitted as successive frames of encoded signals and the information is reconstructed from said encoded signals at a receiver , said method comprising : storing encoded signals from a first frame prior to said lost frame ;
storing encoded signals from a second frame subsequent to said lost frame ;
and interpolating between the encoded signals from said first and second frames (frame concealment, decoder concealment, determining concealment, conducting concealment, decoder determines concealment) to obtain recovered encoded signals for said lost frame .

WO9966494A1
CLAIM 2
. A method according to claim 1 , wherein said encoded signals include a plurality of Line Spectral Pair (LSP) parameters corresponding to each frame , and said interpolating step comprises i (LP filter) nterpolating between the LSP parameters of said first frame and the LSP parameters of said second frame .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
CN1274456A

Filed: 1999-05-18     Issued: 2000-11-22

Speech coder

(Original Assignee) University of Surrey

S. P. Villette, A. M. Kondoz
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (predetermined, spectral amplitude) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (的第二个) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。
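As a reading aid for the artificial periodic excitation recited in US7693710B2 claim 1, the sketch below places copies of a low-pass filter impulse response at the quantized first-glottal-pulse position and then every average pitch lag until the end of the reconstructed region. It is a minimal Python/numpy illustration, not the patented implementation; the name build_periodic_excitation and its arguments are hypothetical.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_impulse):
    # Artificial periodic part: low-pass filter impulse responses centred on
    # the quantized first glottal pulse and spaced by the average pitch value.
    exc = np.zeros(frame_len)
    h = np.asarray(lp_impulse, dtype=float)
    half = len(h) // 2
    step = max(int(round(avg_pitch)), 1)
    pos = int(round(first_pulse_pos))
    while pos < frame_len:
        start = pos - half
        lo, hi = max(start, 0), min(start + len(h), frame_len)
        if hi > lo:
            exc[lo:hi] += h[lo - start:hi - start]
        pos += step
    return exc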

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。
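Claim 3 of US7693710B2 locates the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizes its position. The fragment below is a minimal sketch of that step, assuming the search is run on the LP residual and that a uniform 4-sample position grid is acceptable; both are assumptions of the example, not limitations from the claim.

import numpy as np

def first_glottal_pulse_position(residual, pitch, step=4):
    # Search the first pitch period of the LP residual for the sample of
    # maximum absolute amplitude, then quantize its position on a coarse grid.
    T0 = min(int(round(pitch)), len(residual))
    segment = np.asarray(residual[:T0], dtype=float)
    pos = int(np.argmax(np.abs(segment)))
    q_pos = min(step * int(round(pos / step)), T0 - 1)   # uniform quantizer
    sign = 1 if segment[pos] >= 0 else -1
    return q_pos, sign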

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。
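Claim 4 of US7693710B2 computes the energy information parameter in relation to a maximum of the signal energy for voiced or onset frames and in relation to an average energy per sample for other frames. The fragment below is a hedged sketch of one way to read that computation; the pitch-synchronous maximum, the dB conversion and the function name energy_information are assumptions of this example.

import numpy as np

def energy_information(frame, frame_class, pitch=None):
    # Class-dependent energy parameter: pitch-synchronous maximum energy for
    # voiced/onset frames, average energy per sample otherwise, in dB.
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset") and pitch:
        T0 = min(int(round(pitch)), len(x))
        e = max(np.dot(x[i:i + T0], x[i:i + T0])
                for i in range(0, len(x) - T0 + 1, T0))
    else:
        e = np.dot(x, x) / len(x)
    return 10.0 * np.log10(max(e, 1e-12))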

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (的第二个) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。
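Claim 5 of US7693710B2 scales the synthesized signal so that its energy at the beginning of the first good frame matches the energy at the end of the last concealed frame, then converges toward the transmitted energy toward the frame end while limiting any energy increase. A minimal sketch, assuming a sample-by-sample linear gain ramp and an illustrative gain cap of 1.98 (neither value is taken from the claim):

import numpy as np

def control_energy(synth, e_prev_end, e_target, max_gain=1.98):
    # Match the frame-start energy to the concealed frame's ending energy,
    # then ramp the gain toward the received energy target with a cap that
    # limits how much the energy may grow.
    x = np.asarray(synth, dtype=float)
    e0 = np.dot(x[:32], x[:32]) + 1e-12      # energy at the frame beginning
    e1 = np.dot(x[-32:], x[-32:]) + 1e-12    # energy at the frame end
    g0 = min(np.sqrt(e_prev_end / e0), max_gain)
    g1 = min(np.sqrt(e_target / e1), max_gain)
    gains = np.linspace(g0, g1, len(x))      # per-sample gain interpolation
    return x * gains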

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (的第二个) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (的第二个) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (的第二个) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

CN1274456A
CLAIM 40
. 一种语音编码器,其中,含有一个对输入语音信号进行编码的编码器,该编码器包括对输入语音信号取样以产生数字样本并将该样本划分为每帧含有事先确定个数数字样本的帧的装置,对每帧样本进行分析并为每帧前导部分和结尾部分分别产生一组线性频谱频率(LSF)系数的线性预测编码装置,为每帧样本确定至少一个基音值的基音确定装置,用于定义每帧中浊音和清音信号量度的语音装置,用于为每帧样本产生幅值信息的幅值确定装置,用于量化前述LSF系数组,基音值,浊音和清音信号量度及幅值信息来产生一组量化指标的量化装置,其中,前述量化装置通过下式为当前帧的前导部分定义了一组量化的LSF系数(LSF’2):LSF’2=αLSF’1+(1-α)LSF’3式中LSF’3和LSF’1分别为量化的当前帧的尾段和紧邻前一帧LSF系数,α为第一个矢量量化码表中的一个矢量,定义当前帧的每组前导和尾段部分的LSF系数LSF’2和LSF’3为第二个矢量量化码表中相应LSF量化矢量Q2,Q3及相应预期值P2,P3的组合,此处P2=λQ1,P3=λQ2,λ为一常数,Q1为紧邻前一帧尾段的LSF量化矢量,分别从第一 (LP filter excitation signal) 个矢量码表和第二个矢量码表中选择前述矢量Q3和前述矢量α,以最大限度减低由线性预测编码装置产生的当前帧的LSF系数(LSF2,LSF3)与相应的量化LSF系数(LSF’2,LSF’3)之间的失真。

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (从第一) produced in the decoder during the received first non (的第二个) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 40
. 一种语音编码器,其中,含有一个对输入语音信号进行编码的编码器,该编码器包括对输入语音信号取样以产生数字样本并将该样本划分为每帧含有事先确定个数数字样本的帧的装置,对每帧样本进行分析并为每帧前导部分和结尾部分分别产生一组线性频谱频率(LSF)系数的线性预测编码装置,为每帧样本确定至少一个基音值的基音确定装置,用于定义每帧中浊音和清音信号量度的语音装置,用于为每帧样本产生幅值信息的幅值确定装置,用于量化前述LSF系数组,基音值,浊音和清音信号量度及幅值信息来产生一组量化指标的量化装置,其中,前述量化装置通过下式为当前帧的前导部分定义了一组量化的LSF系数(LSF’2):LSF’2=αLSF’1+(1-α)LSF’3式中LSF’3和LSF’1分别为量化的当前帧的尾段和紧邻前一帧LSF系数,α为第一个矢量量化码表中的一个矢量,定义当前帧的每组前导和尾段部分的LSF系数LSF’2和LSF’3为第二个矢量量化码表中相应LSF量化矢量Q2,Q3及相应预期值P2,P3的组合,此处P2=λQ1,P3=λQ2,λ为一常数,Q1为紧邻前一帧尾段的LSF量化矢量,分别从第一 (LP filter excitation signal) 个矢量码表和第二个矢量码表中选择前述矢量Q3和前述矢量α,以最大限度减低由线性预测编码装置产生的当前帧的LSF系数(LSF2,LSF3)与相应的量化LSF系数(LSF’2,LSF’3)之间的失真。
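The relation E_q = E_1 · (E_LP0 / E_LP1) in US7693710B2 claim 9 compares the energies of the impulse responses of the last-good and first-good LP synthesis filters. The sketch below computes those impulse-response energies and applies the ratio; the 64-sample impulse-response length and the function names are assumptions of this example, and scipy.signal.lfilter is used only as a convenient IIR filter.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a, n=64):
    # Energy of the impulse response of the LP synthesis filter 1/A(z);
    # 'a' holds the coefficients of A(z) including the leading 1.
    delta = np.zeros(n)
    delta[0] = 1.0
    h = lfilter([1.0], a, delta)
    return float(np.dot(h, h))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    # E_q = E_1 * (E_LP0 / E_LP1): rescale the excitation energy when the new
    # LP filter has a higher gain than the one used during concealment.
    return e1 * lp_impulse_energy(a_last_good) / lp_impulse_energy(a_first_good)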

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (的第二个) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

CN1274456A
CLAIM 40
. 一种语音编码器,其中,含有一个对输入语音信号进行编码的编码器,该编码器包括对输入语音信号取样以产生数字样本并将该样本划分为每帧含有事先确定个数数字样本的帧的装置,对每帧样本进行分析并为每帧前导部分和结尾部分分别产生一组线性频谱频率(LSF)系数的线性预测编码装置,为每帧样本确定至少一个基音值的基音确定装置,用于定义每帧中浊音和清音信号量度的语音装置,用于为每帧样本产生幅值信息的幅值确定装置,用于量化前述LSF系数组,基音值,浊音和清音信号量度及幅值信息来产生一组量化指标的量化装置,其中,前述量化装置通过下式为当前帧的前导部分定义了一组量化的LSF系数(LSF’2):LSF’2=αLSF’1+(1-α)LSF’3式中LSF’3和LSF’1分别为量化的当前帧的尾段和紧邻前一帧LSF系数,α为第一个矢量量化码表中的一个矢量,定义当前帧的每组前导和尾段部分的LSF系数LSF’2和LSF’3为第二个矢量量化码表中相应LSF量化矢量Q2,Q3及相应预期值P2,P3的组合,此处P2=λQ1,P3=λQ2,λ为一常数,Q1为紧邻前一帧尾段的LSF量化矢量,分别从第一 (LP filter excitation signal) 个矢量码表和第二个矢量码表中选择前述矢量Q3和前述矢量α,以最大限度减低由线性预测编码装置产生的当前帧的LSF系数(LSF2,LSF3)与相应的量化LSF系数(LSF’2,LSF’3)之间的失真。

US7693710B2
CLAIM 13
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (的第二个) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 14
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 15
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 16
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 17
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (的第二个) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (的第二个) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (的第二个) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

US7693710B2
CLAIM 20
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (事先确定, 谱幅度) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (的第二个) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

CN1274456A
CLAIM 40
. 一种语音编码器,其中,含有一个对输入语音信号进行编码的编码器,该编码器包括对输入语音信号取样以产生数字样本并将该样本划分为每帧含有事先确定个数数字样本的帧的装置,对每帧样本进行分析并为每帧前导部分和结尾部分分别产生一组线性频谱频率(LSF)系数的线性预测编码装置,为每帧样本确定至少一个基音值的基音确定装置,用于定义每帧中浊音和清音信号量度的语音装置,用于为每帧样本产生幅值信息的幅值确定装置,用于量化前述LSF系数组,基音值,浊音和清音信号量度及幅值信息来产生一组量化指标的量化装置,其中,前述量化装置通过下式为当前帧的前导部分定义了一组量化的LSF系数(LSF’2):LSF’2=αLSF’1+(1-α)LSF’3式中LSF’3和LSF’1分别为量化的当前帧的尾段和紧邻前一帧LSF系数,α为第一个矢量量化码表中的一个矢量,定义当前帧的每组前导和尾段部分的LSF系数LSF’2和LSF’3为第二个矢量量化码表中相应LSF量化矢量Q2,Q3及相应预期值P2,P3的组合,此处P2=λQ1,P3=λQ2,λ为一常数,Q1为紧邻前一帧尾段的LSF量化矢量,分别从第一 (LP filter excitation signal) 个矢量码表和第二个矢量码表中选择前述矢量Q3和前述矢量α,以最大限度减低由线性预测编码装置产生的当前帧的LSF系数(LSF2,LSF3)与相应的量化LSF系数(LSF’2,LSF’3)之间的失真。

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (从第一) produced in the decoder during the received first non (的第二个) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 40
. 一种语音编码器,其中,含有一个对输入语音信号进行编码的编码器,该编码器包括对输入语音信号取样以产生数字样本并将该样本划分为每帧含有事先确定个数数字样本的帧的装置,对每帧样本进行分析并为每帧前导部分和结尾部分分别产生一组线性频谱频率(LSF)系数的线性预测编码装置,为每帧样本确定至少一个基音值的基音确定装置,用于定义每帧中浊音和清音信号量度的语音装置,用于为每帧样本产生幅值信息的幅值确定装置,用于量化前述LSF系数组,基音值,浊音和清音信号量度及幅值信息来产生一组量化指标的量化装置,其中,前述量化装置通过下式为当前帧的前导部分定义了一组量化的LSF系数(LSF’2):LSF’2=αLSF’1+(1-α)LSF’3式中LSF’3和LSF’1分别为量化的当前帧的尾段和紧邻前一帧LSF系数,α为第一个矢量量化码表中的一个矢量,定义当前帧的每组前导和尾段部分的LSF系数LSF’2和LSF’3为第二个矢量量化码表中相应LSF量化矢量Q2,Q3及相应预期值P2,P3的组合,此处P2=λQ1,P3=λQ2,λ为一常数,Q1为紧邻前一帧尾段的LSF量化矢量,分别从第一 (LP filter excitation signal) 个矢量码表和第二个矢量码表中选择前述矢量Q3和前述矢量α,以最大限度减低由线性预测编码装置产生的当前帧的LSF系数(LSF2,LSF3)与相应的量化LSF系数(LSF’2,LSF’3)之间的失真。

US7693710B2
CLAIM 22
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 23
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 24
. A device for conducting concealment (事先确定, 谱幅度) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (事先确定, 谱幅度) and decoder recovery when a gain of a LP filter of a first non (的第二个) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (从第一) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
CN1274456A
CLAIM 1
. 一种语音编码器,该语音编码器含有对划分为帧的,每帧含有事先确定 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) 个数数字样本的输入语音信号进行编码的编码器,该编码器包括(采用):·为每帧样本产生至少一组线性预测系数的线性预测编码装置,·为每帧样本确定至少一个基音值的基音确定装置,该基音确定装置包括:用频域技术分析样本(频域分析)的第一种估计装置,用时域技术分析样本(时域分析)的第二种估计装置及由所述频域分析和时域分析结果导出所述基音值的基音评价装置,·用于定义每帧中浊音和清音信号量度的语音装置,·用于为每帧样本产生幅值信息的幅值确定装置,·用于量化前述线性预测系数,基音值,浊音和清音信号量度及幅值信息来为每帧样本产生一组量化指标的量化装置,其中,前述第一种估计装置对若干候选基音中的每个基音值生成以个量度,前述第二种估计装置对相同候选基音中的每个基音值生成相应的第二个 (first non, first impulse response) 量度,前述基音评价装置至少组合若干以上第一个量度和与之相应的第二个量度并通过引用该组合结果选出候选基音其中之一。

CN1274456A
CLAIM 19
. 一种如权利要求1到18任何一项所要求的语音编码器,其中,幅值决定装置为每帧产生一组以由前述基音决定装置决定的基音值有关的谐波频率为中心的不同频段的频谱幅度 (conducting frame erasure concealment, frame erasure concealment, conducting concealment) ,并且前述量化装置将这些频谱幅度量化以产生一幅度量化指标的第一部分。

CN1274456A
CLAIM 40
. 一种语音编码器,其中,含有一个对输入语音信号进行编码的编码器,该编码器包括对输入语音信号取样以产生数字样本并将该样本划分为每帧含有事先确定个数数字样本的帧的装置,对每帧样本进行分析并为每帧前导部分和结尾部分分别产生一组线性频谱频率(LSF)系数的线性预测编码装置,为每帧样本确定至少一个基音值的基音确定装置,用于定义每帧中浊音和清音信号量度的语音装置,用于为每帧样本产生幅值信息的幅值确定装置,用于量化前述LSF系数组,基音值,浊音和清音信号量度及幅值信息来产生一组量化指标的量化装置,其中,前述量化装置通过下式为当前帧的前导部分定义了一组量化的LSF系数(LSF’2):LSF’2=αLSF’1+(1-α)LSF’3式中LSF’3和LSF’1分别为量化的当前帧的尾段和紧邻前一帧LSF系数,α为第一个矢量量化码表中的一个矢量,定义当前帧的每组前导和尾段部分的LSF系数LSF’2和LSF’3为第二个矢量量化码表中相应LSF量化矢量Q2,Q3及相应预期值P2,P3的组合,此处P2=λQ1,P3=λQ2,λ为一常数,Q1为紧邻前一帧尾段的LSF量化矢量,分别从第一 (LP filter excitation signal) 个矢量码表和第二个矢量码表中选择前述矢量Q3和前述矢量α,以最大限度减低由线性预测编码装置产生的当前帧的LSF系数(LSF2,LSF3)与相应的量化LSF系数(LSF’2,LSF’3)之间的失真。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6351730B2

Filed: 1999-03-30     Issued: 2002-02-26

Low-complexity, low-delay, scalable and embedded speech and audio coding with adaptive frame loss concealment

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Juin-Hwey Chen
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (successive time) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6351730B2
CLAIM 1
. A system for processing audio signals comprising : (a) a frame extractor for dividing an input audio signal into a plurality of signal frames corresponding to successive time (pitch period) intervals ;
(b) a transform processor for performing transform computation of the input audio signal in at least one signal frame , said transform processor generating a transform signal having one or more (NB) bands ;
(c) a quantizer providing quantized values associated with the transform signal in said NB bands ;
(d) an output processor for forming an output bit stream corresponding to an encoded version of the input audio signal ;
and (e) a decoder capable of reconstructing from the output bit stream at least two replicas of the input audio signal , each replica having a different sampling rate , without using downsampling .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .
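US6351730B2 claim 26 analyzes the immediately preceding frame to determine an optimum time lag and then extrapolates the waveform into the missing frame. The following sketch, assuming a normalized-correlation lag search over an illustrative 40-160 sample range and a decoded history buffer of at least max_lag + min_lag samples, shows the general idea rather than the reference's exact procedure.

import numpy as np

def extrapolate_missing_frame(history, frame_len, min_lag=40, max_lag=160):
    # Pick the lag whose past segment best matches the tail of the decoded
    # history (normalized correlation), then repeat the waveform at that lag.
    h = np.asarray(history, dtype=float)
    tail = h[-min_lag:]                      # template taken from the frame end
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, max_lag + 1):
        cand = h[-lag - min_lag:-lag]
        score = np.dot(tail, cand) / (np.linalg.norm(tail) * np.linalg.norm(cand) + 1e-12)
        if score > best_score:
            best_lag, best_score = lag, score
    idx = np.arange(frame_len)
    return h[len(h) - best_lag + (idx % best_lag)]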

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (successive time) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6351730B2
CLAIM 1
. A system for processing audio signals comprising : (a) a frame extractor for dividing an input audio signal into a plurality of signal frames corresponding to successive time (pitch period) intervals ;
(b) a transform processor for performing transform computation of the input audio signal in at least one signal frame , said transform processor generating a transform signal having one or more (NB) bands ;
(c) a quantizer providing quantized values associated with the transform signal in said NB bands ;
(d) an output processor for forming an output bit stream corresponding to an encoded version of the input audio signal ;
and (e) a decoder capable of reconstructing from the output bit stream at least two replicas of the input audio signal , each replica having a different sampling rate , without using downsampling .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (point DCT) per sample for other frames .
US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 36
. The method of claim 35 , wherein the step of directly synthesizing at a ¼ sampling rate without downsampling comprises computing a (M/4)-point DCT (average energy) type IV for the first quarter of the received DCT coefficients , as follows : $y_n = \sqrt{\frac{2}{M/4}} \sum_{k=0}^{M/4-1} X_k \cos\!\left[ \frac{(n+\frac{1}{2})(k+\frac{1}{2})\pi}{M/4} \right]$ , where $y_n = 2 \left[ \sqrt{\frac{2}{M}} \sum_{k=0}^{M/4-1} X_k \cos\!\left[ \frac{\left( (4n+\frac{3}{2}) + \frac{1}{2} \right)(k+\frac{1}{2})\pi}{M} \right] \right] = 2\,\tilde{x}_{4n+3/2}$ , so that $\tilde{x}_{4n+3/2} = \frac{1}{2} y_n$ , where $\tilde{x}_n = \sqrt{\frac{2}{M}} \sum_{k=0}^{M/4-1} X_k \cos\!\left[ \frac{(n+\frac{1}{2})(k+\frac{1}{2})\pi}{M} \right]$ , and using the above quantities in a DCT type IV inverse computation to obtain the reconstructed output signal having a ¼ sampling rate .
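
The quoted claim 36 applies an (M/4)-point DCT type IV directly to the first quarter of the received coefficients to obtain a quarter-rate output without downsampling. The sketch below is a direct, unoptimized DCT-IV with the sqrt(2/N) normalization used above; with that normalization the transform matrix is symmetric and orthogonal, so the same routine also serves as the inverse. Function names are illustrative, not from the reference.

import numpy as np

def dct_iv(x):
    # Direct O(N^2) DCT type IV with the sqrt(2/N) normalization; the matrix
    # is symmetric and orthogonal, so the routine acts as its own inverse.
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    C = np.cos(np.pi * (n[:, None] + 0.5) * (n[None, :] + 0.5) / N)
    return np.sqrt(2.0 / N) * (C @ x)

def quarter_rate_synthesis(X, M):
    # Apply an (M/4)-point DCT-IV to the first quarter of the M received
    # coefficients, yielding a quarter-rate time signal without downsampling.
    return dct_iv(np.asarray(X[: M // 4], dtype=float))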

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (desired bit rate) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
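
A minimal sketch of the energy control described in claim 5: the synthesized signal of the first good frame is scaled so that its starting energy matches the end of the concealed segment, and the per-sample gain is interpolated toward the transmitted energy target with the upward gain capped. The 32-sample measurement windows, the linear gain ramp and max_gain_up are assumptions for illustration, not values from the patent.

import numpy as np

def rescale_first_good_frame(synth, e_concealed_end, e_target, max_gain_up=1.5):
    # g0 matches the start energy to the end of the concealed signal; g1 moves
    # toward the transmitted energy target, with the upward gain capped.
    x = np.asarray(synth, dtype=float)
    e_start = np.mean(x[:32] ** 2) + 1e-12
    e_end = np.mean(x[-32:] ** 2) + 1e-12
    g0 = np.sqrt(e_concealed_end / e_start)
    g1 = min(np.sqrt(e_target / e_end), max_gain_up)
    gains = np.linspace(g0, g1, len(x))   # per-sample linear gain ramp
    return x * gains
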
US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate (controlling energy) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (main band) and the first non erased frame received after frame erasure is encoded as active speech .
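
The two special cases in claim 7 reduce to a simple decision: when the erasure spans a voiced-to-unvoiced transition or a comfort-noise-to-active-speech transition, the scaling gain at the start of the first good frame is simply set to the gain used at its end. A small sketch of that decision, with assumed class labels and function name:

def use_end_gain_at_frame_start(prev_class, first_good_class,
                                prev_was_comfort_noise, first_good_is_active):
    # True when the start-of-frame scaling gain should simply equal the
    # end-of-frame gain, per the two transition cases quoted above.
    voiced_like = ("voiced transition", "voiced", "onset")
    voiced_to_unvoiced = prev_class in voiced_like and first_good_class == "unvoiced"
    cn_to_active = prev_was_comfort_noise and first_good_is_active
    return voiced_to_unvoiced or cn_to_active
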
US6351730B2
CLAIM 4
. The system of claim 3 wherein the bandwidth BW(i) of the i-th transform domain band (comfort noise) is given by the expression $BW(i) = BI(i+1) - BI(i)$ , where $BI(i)$ is an array containing the indices corresponding to the transform domain boundaries between bands , and the log-gains are calculated as $LG(i) = \log_2\!\left( \frac{1}{NTPF \times BW(i)} \sum_{m=0}^{NTPF-1} \sum_{k=BI(i)}^{BI(i+1)-1} T^2(k,m) \right)$ , $i = 0, 1, 2, \ldots, NB-1$ .
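
For reference, a short sketch of the quoted per-band log-gain computation, assuming the NTPF transforms of a frame are stacked as columns of a matrix T and bi holds the NB+1 band boundary indices (names are illustrative):

import numpy as np

def band_log_gains(T, bi):
    # T: transform coefficients, shape (num_coefficients, NTPF) with one
    # column per transform of the frame; bi: NB+1 band boundary indices.
    ntpf = T.shape[1]
    nb = len(bi) - 1
    lg = np.zeros(nb)
    for i in range(nb):
        bw = bi[i + 1] - bi[i]                      # BW(i)
        power = np.sum(T[bi[i]:bi[i + 1], :] ** 2)  # sum over band and transforms
        lg[i] = np.log2(power / (ntpf * bw) + 1e-30)
    return lg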

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (delay processing, formula i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6351730B2
CLAIM 17
. The method of claim 11 wherein the size of the frame is selected relatively short to enable low algorithmic delay processing (LP filter excitation signal) .

US6351730B2
CLAIM 25
. The method of claim 24 wherein the bit allocation formula (LP filter excitation signal) is modified to : $R_k = R + \frac{1}{2}\left( lg(k) - \frac{1}{BI(NB)} \sum_{j=0}^{BI(NB)-1} lg(j) \right)$ , or $R_k = R + \frac{1}{2}\left[ lg(k) - \overline{lg} \right]$ , where $lg(k) = LGQ(i)$ for $k = BI(i), BI(i)+1, \ldots, BI(i+1)-1$ , and $LGQ(i)$ is the quantized log-gain in the i-th band ;
and $\overline{lg} = \frac{1}{BI(NB)} \sum_{i=0}^{NB-1} \left[ BI(i+1) - BI(i) \right] LGQ(i)$ is the average quantized log-gain averaged over all frequency bands .
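
The quoted bit-allocation rule becomes a straightforward per-coefficient formula once the quantized band log-gains are expanded to every coefficient; the sketch below follows that reading. Variable names (lgq, bi, r_avg) are assumptions for illustration.

import numpy as np

def bit_allocation(lgq, bi, r_avg):
    # lgq[i]: quantized log-gain LGQ(i) of band i; bi: NB+1 band boundaries;
    # r_avg: average bits per coefficient R.  Returns R_k for every coefficient.
    nb = len(lgq)
    lg = np.concatenate([np.full(bi[i + 1] - bi[i], lgq[i]) for i in range(nb)])
    lg_bar = lg.mean()   # equals the band-width-weighted average of the claim
    return r_avg + 0.5 * (lg - lg_bar)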

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information (LP filter) about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal (delay processing, formula i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ , where $E_1$ is an energy at an end of the current frame (Discrete Cosine, current frame, frame loss) , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
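
The relation $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ of claim 9 only needs the energies of the LP synthesis filter impulse responses. A minimal sketch, assuming the LP coefficients are given as [1, a_1, ..., a_P] and using a 64-sample truncation of the impulse response (an assumed choice, not from the patent):

import numpy as np

def lp_impulse_response_energy(a, n=64):
    # Energy of the impulse response of the all-pole synthesis filter 1/A(z),
    # with A(z) = 1 + a[1] z^-1 + ... + a[P] z^-P; n is a truncation length.
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(i, len(a) - 1) + 1):
            acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_before_erasure, a_first_good):
    # E_q = E_1 * E_LP0 / E_LP1 as quoted in the claim.
    e_lp0 = lp_impulse_response_energy(a_before_erasure)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1
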
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 17
. The method of claim 11 wherein the size of the frame is selected relatively short to enable low algorithmic delay processing (LP filter excitation signal) .

US6351730B2
CLAIM 25
. The method of claim 24 wherein the bit allocation formula (LP filter excitation signal) is modified to : $R_k = R + \frac{1}{2}\left( lg(k) - \frac{1}{BI(NB)} \sum_{j=0}^{BI(NB)-1} lg(j) \right)$ , or $R_k = R + \frac{1}{2}\left[ lg(k) - \overline{lg} \right]$ , where $lg(k) = LGQ(i)$ for $k = BI(i), BI(i)+1, \ldots, BI(i+1)-1$ , and $LGQ(i)$ is the quantized log-gain in the i-th band ;
and $\overline{lg} = \frac{1}{BI(NB)} \sum_{i=0}^{NB-1} \left[ BI(i+1) - BI(i) \right] LGQ(i)$ is the average quantized log-gain averaged over all frequency bands .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information (LP filter) about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (successive time) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
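
Claim 11 locates the first glottal pulse as the maximum-amplitude sample within a pitch period and quantizes its position. A minimal sketch, where the use of the LP residual and the uniform quantization step are assumptions for illustration:

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, quant_step=4):
    # The maximum-amplitude sample inside the first pitch period is taken as
    # the first glottal pulse; its position is uniformly quantized.
    T = int(pitch_period)
    segment = np.asarray(residual[:T], dtype=float)
    pos = int(np.argmax(np.abs(segment)))
    sign = 1 if segment[pos] >= 0 else -1
    amplitude = float(abs(segment[pos]))
    q_pos = min(int(round(pos / quant_step)) * quant_step, T - 1)
    return q_pos, sign, amplitude
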
US6351730B2
CLAIM 1
. A system for processing audio signals comprising : (a) a frame extractor for dividing an input audio signal into a plurality of signal frames corresponding to successive time (pitch period) intervals ;
(b) a transform processor for performing transform computation of the input audio signal in at least one signal frame , said transform processor generating a transform signal having one or more (NB) bands ;
(c) a quantizer providing quantized values associated with the transform signal in said NB bands ;
(d) an output processor for forming an output bit stream corresponding to an encoded version of the input audio signal ;
and (e) a decoder capable of reconstructing from the output bit stream at least two replicas of the input audio signal , each replica having a different sampling rate , without using downsampling .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (time lag) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (delay processing, formula i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ , where $E_1$ is an energy at an end of the current frame (Discrete Cosine, current frame, frame loss) , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 10
. The system of claim 9 wherein the adaptive frame loss concealment processor computes an optimum time lag (replacement frame) for waveform signal interpolation .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 17
. The method of claim 11 wherein the size of the frame is selected relatively short to enable low algorithmic delay processing (LP filter excitation signal) .

US6351730B2
CLAIM 25
. The method of claim 24 wherein the bit allocation formula (LP filter excitation signal) is modified to : $R_k = R + \frac{1}{2}\left( lg(k) - \frac{1}{BI(NB)} \sum_{j=0}^{BI(NB)-1} lg(j) \right)$ , or $R_k = R + \frac{1}{2}\left[ lg(k) - \overline{lg} \right]$ , where $lg(k) = LGQ(i)$ for $k = BI(i), BI(i)+1, \ldots, BI(i+1)-1$ , and $LGQ(i)$ is the quantized log-gain in the i-th band ;
and $\overline{lg} = \frac{1}{BI(NB)} \sum_{i=0}^{NB-1} \left[ BI(i+1) - BI(i) \right] LGQ(i)$ is the average quantized log-gain averaged over all frequency bands .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information (LP filter) about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate .

US7693710B2
CLAIM 13
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (successive time) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
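
A rough sketch of the artificial periodic excitation of claim 13: an assumed precomputed low-pass filter impulse response is centered on the quantized first-pulse position and then repeated every average pitch period up to the end of the frame (the claim's subframe bookkeeping is omitted for brevity; names are illustrative):

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    # lp_ir: impulse response of an assumed low-pass filter.  Its copies are
    # centered on the first (quantized) pulse position and then repeated
    # every average pitch period until the end of the frame.
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = int(first_pulse_pos)
    while pos < frame_len:
        for i, h in enumerate(lp_ir):
            j = pos - half + i
            if 0 <= j < frame_len:
                exc[j] += h
        pos += max(1, int(round(avg_pitch)))
    return exc
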
US6351730B2
CLAIM 1
. A system for processing audio signals comprising : (a) a frame extractor for dividing an input audio signal into a plurality of signal frames corresponding to successive time (pitch period) intervals ;
(b) a transform processor for performing transform computation of the input audio signal in at least one signal frame , said transform processor generating a transform signal having one or more (NB) bands ;
(c) a quantizer providing quantized values associated with the transform signal in said NB bands ;
(d) an output processor for forming an output bit stream corresponding to an encoded version of the input audio signal ;
and (e) a decoder capable of reconstructing from the output bit stream at least two replicas of the input audio signal , each replica having a different sampling rate , without using downsampling .

US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US7693710B2
CLAIM 14
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 15
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (successive time) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6351730B2
CLAIM 1
. A system for processing audio signals comprising : (a) a frame extractor for dividing an input audio signal into a plurality of signal frames corresponding to successive time (pitch period) intervals ;
(b) a transform processor for performing transform computation of the input audio signal in at least one signal frame , said transform processor generating a transform signal having one or more (NB) bands ;
(c) a quantizer providing quantized values associated with the transform signal in said NB bands ;
(d) an output processor for forming an output bit stream corresponding to an encoded version of the input audio signal ;
and (e) a decoder capable of reconstructing from the output bit stream at least two replicas of the input audio signal , each replica having a different sampling rate , without using downsampling .

US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 16
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (point D) per sample for other frames .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 36
. The method of claim 35 , wherein the step of directly synthesizing at a ¼ sampling rate without downsampling comprises computing a (M/4)-point DCT (average energy) type IV for the first quarter of the received DCT coefficients , as follows : $y_n = \sqrt{\frac{2}{M/4}} \sum_{k=0}^{M/4-1} X_k \cos\!\left[ \frac{(n+\frac{1}{2})(k+\frac{1}{2})\pi}{M/4} \right]$ , where $y_n = 2 \left[ \sqrt{\frac{2}{M}} \sum_{k=0}^{M/4-1} X_k \cos\!\left[ \frac{\left( (4n+\frac{3}{2}) + \frac{1}{2} \right)(k+\frac{1}{2})\pi}{M} \right] \right] = 2\,\tilde{x}_{4n+3/2}$ , so that $\tilde{x}_{4n+3/2} = \frac{1}{2} y_n$ , where $\tilde{x}_n = \sqrt{\frac{2}{M}} \sum_{k=0}^{M/4-1} X_k \cos\!\left[ \frac{(n+\frac{1}{2})(k+\frac{1}{2})\pi}{M} \right]$ , and using the above quantities in a DCT type IV inverse computation to obtain the reconstructed output signal having a ¼ sampling rate .

US7693710B2
CLAIM 17
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (main band) and the first non erased frame received after frame erasure is encoded as active speech .
US6351730B2
CLAIM 4
. The system of claim 3 wherein the bandwidth BW(i) of the i-th transform domain band (comfort noise) is given by the expression $BW(i) = BI(i+1) - BI(i)$ , where $BI(i)$ is an array containing the indices corresponding to the transform domain boundaries between bands , and the log-gains are calculated as $LG(i) = \log_2\!\left( \frac{1}{NTPF \times BW(i)} \sum_{m=0}^{NTPF-1} \sum_{k=BI(i)}^{BI(i+1)-1} T^2(k,m) \right)$ , $i = 0, 1, 2, \ldots, NB-1$ .

US7693710B2
CLAIM 20
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (delay processing, formula i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 17
. The method of claim 11 wherein the size of the frame is selected relatively short to enable low algorithmic delay processing (LP filter excitation signal) .

US6351730B2
CLAIM 25
. The method of claim 24 wherein the bit allocation formula (LP filter excitation signal) is modified to : $R_k = R + \frac{1}{2}\left( lg(k) - \frac{1}{BI(NB)} \sum_{j=0}^{BI(NB)-1} lg(j) \right)$ , or $R_k = R + \frac{1}{2}\left[ lg(k) - \overline{lg} \right]$ , where $lg(k) = LGQ(i)$ for $k = BI(i), BI(i)+1, \ldots, BI(i+1)-1$ , and $LGQ(i)$ is the quantized log-gain in the i-th band ;
and $\overline{lg} = \frac{1}{BI(NB)} \sum_{i=0}^{NB-1} \left[ BI(i+1) - BI(i) \right] LGQ(i)$ is the average quantized log-gain averaged over all frequency bands .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and-add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information (LP filter) about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal (delay processing, formula i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ , where $E_1$ is an energy at an end of a current frame (Discrete Cosine, current frame, frame loss) , $E_{LP0}$ is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 17
. The method of claim 11 wherein the size of the frame is selected relatively short to enable low algorithmic delay processing (LP filter excitation signal) .

US6351730B2
CLAIM 25
. The method of claim 24 wherein the bit allocation formula (LP filter excitation signal) is modified to : $R_k = R + \frac{1}{2}\left( lg(k) - \frac{1}{BI(NB)} \sum_{j=0}^{BI(NB)-1} lg(j) \right)$ , or $R_k = R + \frac{1}{2}\left[ lg(k) - \overline{lg} \right]$ , where $lg(k) = LGQ(i)$ for $k = BI(i), BI(i)+1, \ldots, BI(i+1)-1$ , and $LGQ(i)$ is the quantized log-gain in the i-th band ;
and $\overline{lg} = \frac{1}{BI(NB)} \sum_{i=0}^{NB-1} \left[ BI(i+1) - BI(i) \right] LGQ(i)$ is the average quantized log-gain averaged over all frequency bands .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information (LP filter) about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate .
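
The embedded-layering idea of this claim can be pictured with a short, purely illustrative sketch: a base layer carries coefficients and average powers for M1 bands, and the higher-rate stream literally embeds that base layer before appending bands M1..M2-1. The packing below is an assumption; the claim itself does not prescribe any container format.

def build_embedded_streams(band_payloads, band_powers, m1, m2):
    # Hypothetical sketch: band_payloads[i] / band_powers[i] are the already
    # quantized coefficients and average power of band i.
    assert m1 <= m2 <= len(band_payloads)
    base = [(band_payloads[i], band_powers[i]) for i in range(m1)]
    enhancement = [(band_payloads[i], band_powers[i]) for i in range(m1, m2)]
    low_rate_stream = base
    high_rate_stream = base + enhancement         # embeds the low-rate stream
    return low_rate_stream, high_rate_stream
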

US7693710B2
CLAIM 22
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
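
A minimal, purely hypothetical sketch of what encoding a shape, sign and amplitude of the first glottal pulse could look like is given below. The shape codebook, the scalar amplitude quantizer and the correlation-based shape search are assumptions for illustration; the claim does not fix any of them.

import numpy as np

def encode_first_glottal_pulse(pulse, shape_codebook, amp_step):
    # Hypothetical sketch: represent the first glottal pulse by a shape index,
    # a sign and a scalar-quantized amplitude.  shape_codebook is an assumed
    # table of unit-energy pulse shapes, one per row.
    peak = int(np.argmax(np.abs(pulse)))
    amp = abs(pulse[peak])
    sign = 1 if pulse[peak] >= 0 else -1
    normalized = pulse / (sign * amp + 1e-12)
    shape_idx = int(np.argmax(shape_codebook @ normalized))  # best-matching shape
    amp_idx = int(round(amp / amp_step))                     # quantized amplitude
    return shape_idx, sign, amp_idx
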
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .
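
To make the extrapolation step concrete, a hypothetical sketch is given below: the lag search over the previously decoded samples uses a normalized-correlation criterion, which is an assumption, since the claim only requires that an optimum time lag be determined from the immediately preceding frame. 'history' is assumed to be a 1-D numpy array of past output samples at least frame_len + max_lag long.

import numpy as np

def conceal_missing_frame(history, frame_len, min_lag, max_lag):
    # Hypothetical sketch: pick the lag that best predicts the most recent
    # decoded segment, then extrapolate one missing frame by repeating the
    # waveform with that period.
    target = history[-frame_len:]
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, max_lag + 1):
        cand = history[-frame_len - lag:-lag]
        score = float(np.dot(target, cand)) / (np.linalg.norm(cand) + 1e-12)
        if score > best_score:
            best_lag, best_score = lag, score
    idx = np.arange(frame_len)
    return history[len(history) - best_lag + (idx % best_lag)]
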

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 23
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (successive time) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
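
As an illustration of the position search and quantization recited above, the hypothetical helper below takes the maximum-amplitude sample inside the first pitch period of the residual as the first glottal pulse and quantizes its position with a uniform quantizer; the 6-bit budget and the uniform step are assumptions.

import numpy as np

def quantize_glottal_pulse_position(residual, pitch_period, num_bits=6):
    # Hypothetical sketch: locate the sample of maximum amplitude within one
    # pitch period and quantize its position relative to the frame start.
    segment = residual[:pitch_period]
    position = int(np.argmax(np.abs(segment)))
    levels = 2 ** num_bits
    step = max(pitch_period / levels, 1)
    q_index = min(int(position / step), levels - 1)
    return position, q_index
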
US6351730B2
CLAIM 1
. A system for processing audio signals comprising : (a) a frame extractor for dividing an input audio signal into a plurality of signal frames corresponding to successive time (pitch period) intervals ;
(b) a transform processor for performing transform computation of the input audio signal in at least one signal frame , said transform processor generating a transform signal having one or more (NB) bands ;
(c) a quantizer providing quantized values associated with the transform signal in said NB bands ;
(d) an output processor for forming an output bit stream corresponding to an encoded version of the input audio signal ;
and (e) a decoder capable of reconstructing from the output bit stream at least two replicas of the input audio signal , each replica having a different sampling rate , without using downsampling .

US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US7693710B2
CLAIM 24
. A device for conducting concealment (Discrete Cosine, current frame, frame loss) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (point D) per sample for other frames .
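
A short, assumption-laden sketch of the two-branch energy computation described above: maximum sample energy for frames classified voiced or onset, average energy per sample otherwise. Expressing the result in dB is an added convenience, not something the claim requires.

import numpy as np

def energy_information_parameter(frame, frame_class):
    # Hypothetical sketch of the claimed two-branch energy information
    # parameter; 'frame' is a 1-D numpy array of samples.
    if frame_class in ("voiced", "onset"):
        energy = float(np.max(frame ** 2))   # maximum of the signal energy
    else:
        energy = float(np.mean(frame ** 2))  # average energy per sample
    return 10.0 * np.log10(energy + 1e-12)
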
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 36
. The method of claim 35 , wherein the step of directly synthesizing at a ¼ sampling rate without downsampling comprises computing an (M/4)-point DCT type IV (average energy) for the first quarter of the received DCT coefficients , as follows : $y_n = \frac{2}{(M/4)} \sum_{k=0}^{M/4-1} X_k \cos\left[\frac{(n+\frac{1}{2})(k+\frac{1}{2})\pi}{(M/4)}\right]$ where $y_n = 2\left[\frac{2}{M} \sum_{k=0}^{M/4-1} X_k \cos\left[\frac{\left((4n+\frac{3}{2})+\frac{1}{2}\right)(k+\frac{1}{2})\pi}{M}\right]\right] = 2\,\tilde{x}_{4n+3/2}$ so that $\tilde{x}_{4n+3/2} = \frac{1}{2} y_n$ where : $\tilde{x}_n = \frac{2}{M} \sum_{k=0}^{M/4-1} X_k \cos\left[\frac{(n+\frac{1}{2})(k+\frac{1}{2})\pi}{M}\right]$ and using the above quantities in a DCT type IV inverse computation to obtain the reconstructed output signal having a ¼ sampling rate .
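
For orientation only, a direct O(N^2) DCT type IV is sketched below and applied to the first quarter of the received coefficients, mirroring the quoted claim at a high level. The normalization used here is generic and does not reproduce the exact scaling of the claim, and the function names are hypothetical.

import numpy as np

def dct_iv(x):
    # Direct DCT type IV: y[n] = sum_k x[k] * cos(pi/N * (n+1/2)(k+1/2)),
    # without any normalization factor.
    N = len(x)
    n = np.arange(N)[:, None]
    k = np.arange(N)[None, :]
    return (x * np.cos(np.pi / N * (n + 0.5) * (k + 0.5))).sum(axis=1)

def quarter_rate_synthesis(X, M):
    # Hypothetical sketch: run a small (M/4)-point DCT-IV over only the first
    # quarter of the received DCT coefficients to obtain a 1/4-rate signal.
    return (2.0 / (M / 4)) * dct_iv(np.asarray(X[: M // 4], dtype=float))
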

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame, sample values) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (time lag) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (Discrete Cosine, current frame, frame loss) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (delay processing, formula i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of a current frame (Discrete Cosine, current frame, frame loss) , $E_{LP0}$ is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6351730B2
CLAIM 9
. The system of claim 8 wherein the decoder comprises an adaptive frame loss (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) concealment processor operating to reduce the effect of missing frames on the quality of the output signal .

US6351730B2
CLAIM 10
. The system of claim 9 wherein the adaptive frame loss concealment processor computes an optimum time lag (replacement frame) for waveform signal interpolation .

US6351730B2
CLAIM 12
. The method of claim 11 using M transforms for each signal frame , said transforms performed over partially overlapping windows which cover the audio signal in a current frame (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) and at least one adjacent frame , wherein the overlapping portion is equal to 1/M of the frame size .

US6351730B2
CLAIM 14
. The method of claim 11 wherein said at least two relatively short-size transforms are Modified Discrete Cosine (decoder determines concealment, current frame, frame concealment, determining concealment, conducting concealment) Transforms (MDCTs) .

US6351730B2
CLAIM 17
. The method of claim 11 wherein the size of the frame is selected relatively short to enable low algorithmic delay processing (LP filter excitation signal) .

US6351730B2
CLAIM 25
. The method of claim 24 wherein the bit allocation formula (LP filter excitation signal) is modified to : $R_k = R + \frac{1}{2}\left(\lg(k) - \frac{1}{BI(NB)}\sum_{j=0}^{BI(NB)-1}\lg(j)\right)$ , or $R_k = R + \frac{1}{2}\left[\lg(k) - \overline{\lg}\right]$ , where $\lg(k) = LGQ(i)$ , for $k = BI(i) , BI(i)+1 , \ldots , BI(i+1)-1$ , and $LGQ(i)$ is the quantized log-gain in the i-th band ;
and $\overline{\lg} = \frac{1}{BI(NB)}\sum_{i=0}^{NB-1}\left[BI(i+1) - BI(i)\right] LGQ(i)$ is the average quantized log-gain averaged over all frequency bands .

US6351730B2
CLAIM 26
. A method for adaptive frame loss concealment in processing of audio signals divided into frames corresponding to successive time intervals , where for each input frame one or more transform domain computations are performed over partially overlapping windows covering the audio signal , and output synthesis is performed using an overlap-and- add method , the method comprising : in a sequence of received frames identifying a frame as missing ;
analyzing the immediately preceding frame (signal classification parameter) to determine an optimum time lag for waveform signal extrapolation ;
based on the determined optimum time lag performing waveform signal extrapolation to synthesize a first portion of the missing frame , said synthesis using information already available as part of the preceding frame to minimize discontinuities at the frame boundary ;
and performing waveform signal extrapolation in the remaining portion of the missing frame .

US6351730B2
CLAIM 31
. The method of claim 30 wherein a measure of discontinuities is computed in terms of both waveform sample values (signal classification parameter) and waveform slope .

US6351730B2
CLAIM 40
. An embedded coding method for use in processing of an audio signal divided into frames corresponding to successive time intervals , where for each input frame at least one transform domain computation is performed and the resulting transform coefficients are divided into NB bands , each band having at least one transform coefficient , the method comprising : for a pre-specified first bit rate providing a first output bit stream which comprises information (LP filter) about transform coefficients in M 1 ≦NB bands and information about the average power in the M 1 bands , and wherein bit allocation is determined based on a target signal-to-noise ratio (TSNR) in the NB bands , said first output bit stream being sufficient to reconstruct a representation of the audio signal ;
for at least a second pre-specified bit rate higher than the first bit rate , providing an output bit stream embedding said first output bit stream and further comprising information about transform coefficients in M 2 bands , where M 1 ≦M 2 ≦NB , and information about the average power in the M 2 bands , and wherein bit allocation is determined based on the difference between the TSNR in the NB bands and a value determined by the number of bits allocated to each band at the next-lower bit rate ;
and reconstructing a representation of the input signal using an embedded bit stream corresponding to the desired bit rate .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2000267700A

Filed: 1999-03-17     Issued: 2000-09-29

音声符号化復号方法および装置

(Original Assignee) Yrp Kokino Idotai Tsushin Kenkyusho:Kk; 株式会社ワイ・アール・ピー高機能移動体通信研究所     

Seiji Sasaki, 誠司 佐々木
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (生成器, 音発生) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
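
The artificial periodic excitation described in this claim can be pictured with the hypothetical helper below: one copy of a low-pass filter impulse response is centered on the quantized first-glottal-pulse position, and further copies are placed one (rounded) average pitch period apart until the end of the affected segment. The impulse response itself is assumed to be supplied as a short FIR vector; the claim does not fix its design.

import numpy as np

def build_periodic_excitation(seg_len, first_pulse_pos, avg_pitch, lp_impulse):
    # Hypothetical sketch of a low-pass filtered periodic train of pulses.
    excitation = np.zeros(seg_len)
    half = len(lp_impulse) // 2
    pos = first_pulse_pos
    while pos < seg_len:
        start = pos - half
        end = start + len(lp_impulse)
        src0 = max(0, -start)
        dst0, dst1 = max(0, start), min(seg_len, end)
        excitation[dst0:dst1] += lp_impulse[src0:src0 + (dst1 - dst0)]
        pos += max(1, int(round(avg_pitch)))      # next pulse one pitch later
    return excitation
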
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音 (first impulse) 声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号, 音声情報, なる周波数) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (拡散処理) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
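
A hedged sketch of the two-step energy control recited above is given below: a start gain matches the energy of the concealed signal at the frame boundary, an end gain converges toward the energy conveyed by the received energy information parameter but is clamped to limit any increase, and the gain is interpolated sample by sample across the frame. The 32-sample analysis windows and the clamp value are assumptions.

import numpy as np

def control_recovered_energy(synth, e_concealed_end, e_received, max_gain=2.0):
    # Hypothetical sketch: 'synth' is the synthesized signal of the first non
    # erased frame received after the erasure (1-D numpy array, >= 32 samples).
    e_start = float(np.mean(synth[:32] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-32:] ** 2)) + 1e-12
    g0 = np.sqrt(e_concealed_end / e_start)           # match the concealed signal
    g1 = min(np.sqrt(e_received / e_end), max_gain)   # limit the energy increase
    gains = np.linspace(g0, g1, len(synth))           # sample-wise convergence
    return gains * synth
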
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声情報, なる周波数) is a speech signal (音声信号, 音声情報, なる周波数) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (拡散処理) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声情報, なる周波数) is a speech signal (音声信号, 音声情報, なる周波数) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (拡散処理) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (ローパス) of a first non (拡散処理) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声情報, なる周波数) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (ローパス) of a first non (拡散処理) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : $E_q = E_1 \frac{E_{LP0}}{E_{LP1}}$ where $E_1$ is an energy at an end of the current frame , $E_{LP0}$ is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and $E_{LP1}$ is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (音声符号化方法, 符号化器) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (生成器, 音発生) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音 (first impulse) 声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号, 音声情報, なる周波数) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (拡散処理) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2000267700A
CLAIM 1
【請求項1】 線形予測分析・合成方式の音声符号化器 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) によって音声信号 (speech signal, sound signal) が符号化処理された出力である音声情 報ビット列から音声信号を再生する音声復号方法であっ て、 前記音声情報 (speech signal, sound signal) ビット列に含まれるスペクトル包絡情報、 有声/無声識別情報、ピッチ周期情報およびゲイン情報 を分離、復号し、 前記有声/無声識別情報が有声を示すときには、前記ス ペクトル包絡情報により算出される周波数軸上のスペク トル包絡値と予め定めた閾値とを比較して、該スペクト ル包絡値が前記閾値以上になる周波数 (speech signal, sound signal) 領域を有声領域、 その他の領域を無声領域とし、有声領域の音源信号とし て前記ピッチ周期情報に基づき発生されるピッチパルス を用い、無声領域の音源信号として前記ピッチパルスと 白色雑音を所定の割合で混合した信号を用い、前記有声 領域の音源信号および前記無声領域の音源信号を加算し た結果を音源信号とし、 前記有声/無声識別情報が無声を示すときには、白色雑 音を音源信号とし、 該音源信号に対し前記スペクトル包絡情報および前記ゲ イン情報を付加して再生音声を生成することを特徴とす る音声復号方法。

JP2000267700A
CLAIM 3
【請求項3】 標本化され、予め定められた時間長の音 声符号化フレームに分割された入力音声信号から、有声 /無声識別情報、ピッチ周期情報、周期的ピッチか非周 期的ピッチかを示す非周期ピッチ情報を抽出して、符号 化する音声符号化方法 (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) であって、 前記非周期ピッチ情報が周期的ピッチを示す音声符号化 フレームでは、前記ピッチ周期情報を第1の所定のレベ ル数で量子化して、これを周期的ピッチ情報とし、 前記非周期ピッチ情報が非周期的ピッチを示す音声符号 化フレームでは、それぞれのピッチ範囲に対しその発生 度数の大小に応じた量子化レベルの割り当てを行い、第 2の所定のレベル数で量子化して、これを非周期的ピッ チ情報とし、 前記有声/無声識別情報が無声を示す状態に1つの符号 語を割り当て、前記有声/無声識別情報が有声を示す状 態として、前記周期的ピッチ情報に前記第1の所定のレ ベル数に対応する個数の符号語を割り当て、前記非周期 的ピッチ情報に前記第2の所定のレベル数に対応する個 数の符号語を割り当て、これらをまとめて所定のビット 数を有する符号語として符号化することを特徴とする音 声符号化方法。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声情報, なる周波数) is a speech signal (音声信号, 音声情報, なる周波数) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (拡散処理) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) , limits to a given value a gain used for scaling the synthesized sound signal .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.

JP2000267700A
CLAIM 3
[Claim 3] A speech encoding method (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) in which voiced/unvoiced discrimination information, pitch period information, and aperiodic-pitch information indicating whether the pitch is periodic or aperiodic are extracted from an input speech signal that has been sampled and divided into speech coding frames of a predetermined length, and encoded, wherein: in a speech coding frame whose aperiodic-pitch information indicates a periodic pitch, the pitch period information is quantized with a first predetermined number of levels and taken as periodic pitch information; in a speech coding frame whose aperiodic-pitch information indicates an aperiodic pitch, quantization levels are allocated to the respective pitch ranges according to their frequency of occurrence, the pitch period information is quantized with a second predetermined number of levels, and the result is taken as aperiodic pitch information; one codeword is assigned to the state in which the voiced/unvoiced discrimination information indicates unvoiced; for the state in which the voiced/unvoiced discrimination information indicates voiced, a number of codewords corresponding to the first predetermined number of levels is assigned to the periodic pitch information and a number of codewords corresponding to the second predetermined number of levels is assigned to the aperiodic pitch information; and these are collectively encoded as a codeword having a predetermined number of bits.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声情報, なる周波数) is a speech signal (音声信号, 音声情報, なる周波数) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (拡散処理) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.
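As a reading aid, a small sketch of the two transition conditions under which claim 19 above keeps the start-of-frame scaling gain equal to the end-of-frame gain; the class labels and the helper name use_equal_gains are illustrative and not taken from either reference.

```python
def use_equal_gains(last_good_class, first_good_class,
                    last_good_is_comfort_noise=False, first_good_is_active=False):
    """Return True when claim 19's conditions for equal start/end scaling gains
    hold (an illustrative reading of the charted claim text, not decoder source)."""
    voiced_to_unvoiced = (last_good_class in {"voiced transition", "voiced", "onset"}
                          and first_good_class == "unvoiced")
    cn_to_active = last_good_is_comfort_noise and first_good_is_active
    return voiced_to_unvoiced or cn_to_active

print(use_equal_gains("onset", "unvoiced"))              # voiced-to-unvoiced transition: True
print(use_equal_gains("unvoiced", "voiced"))             # neither condition: False
print(use_equal_gains("voiced", "voiced", True, True))   # comfort noise to active speech: True
```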

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (音声符号化方法, 符号化器) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (ローパス) of a first non (拡散処理) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.

JP2000267700A
CLAIM 3
[Claim 3] A speech encoding method (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) in which voiced/unvoiced discrimination information, pitch period information, and aperiodic-pitch information indicating whether the pitch is periodic or aperiodic are extracted from an input speech signal that has been sampled and divided into speech coding frames of a predetermined length, and encoded, wherein: in a speech coding frame whose aperiodic-pitch information indicates a periodic pitch, the pitch period information is quantized with a first predetermined number of levels and taken as periodic pitch information; in a speech coding frame whose aperiodic-pitch information indicates an aperiodic pitch, quantization levels are allocated to the respective pitch ranges according to their frequency of occurrence, the pitch period information is quantized with a second predetermined number of levels, and the result is taken as aperiodic pitch information; one codeword is assigned to the state in which the voiced/unvoiced discrimination information indicates unvoiced; for the state in which the voiced/unvoiced discrimination information indicates voiced, a number of codewords corresponding to the first predetermined number of levels is assigned to the periodic pitch information and a number of codewords corresponding to the second predetermined number of levels is assigned to the aperiodic pitch information; and these are collectively encoded as a codeword having a predetermined number of bits.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.
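A brief sketch of the phase-information step charted for claim 23 above: take the maximum-amplitude sample within a pitch period as the first glottal pulse and quantize its position. The uniform quantization step q_step and the synthetic excitation are assumptions for illustration.

```python
import numpy as np

def first_glottal_pulse(excitation, pitch_period, q_step=4):
    """Locate the maximum-amplitude sample within the first pitch period and
    quantize its position with a uniform step; the claim only recites 'a
    quantizer of the position', so q_step is an assumption."""
    excitation = np.asarray(excitation, dtype=float)
    segment = excitation[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))          # sample of maximum amplitude
    sign = 1 if segment[pos] >= 0 else -1
    amplitude = abs(float(segment[pos]))
    q_pos = int(round(pos / q_step)) * q_step      # quantized pulse position
    return pos, q_pos, sign, amplitude

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    exc = rng.standard_normal(160) * 0.05
    exc[37] = 0.9                                  # synthetic glottal pulse
    print(first_glottal_pulse(exc, pitch_period=80))
```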

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声情報, なる周波数) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号, 音声情報, なる周波数) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder, the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声情報, なる周波数) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (音声符号化方法, 符号化器) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (ローパス) of a first non (拡散処理) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JP2000267700A
CLAIM 1
[Claim 1] A speech decoding method for reproducing a speech signal from a speech information bit stream that is the output of encoding a speech signal (speech signal, sound signal) by a linear-predictive analysis-by-synthesis speech coder (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs), the method comprising: separating and decoding spectral envelope information, voiced/unvoiced discrimination information, pitch period information and gain information contained in the speech information (speech signal, sound signal) bit stream; when the voiced/unvoiced discrimination information indicates voiced, comparing the spectral envelope values on the frequency axis calculated from the spectral envelope information with a predetermined threshold, treating the frequency (speech signal, sound signal) regions in which the spectral envelope value is equal to or greater than the threshold as voiced regions and the other regions as unvoiced regions, using pitch pulses generated on the basis of the pitch period information as the excitation signal of the voiced regions, using a signal in which the pitch pulses and white noise are mixed at a predetermined ratio as the excitation signal of the unvoiced regions, and taking the sum of the excitation signal of the voiced regions and the excitation signal of the unvoiced regions as the excitation signal; when the voiced/unvoiced discrimination information indicates unvoiced, using white noise as the excitation signal; and generating reproduced speech by applying the spectral envelope information and the gain information to the excitation signal.

JP2000267700A
CLAIM 3
[Claim 3] A speech encoding method (decoder determines concealment, decoder concealment, decoder recovery, decoder constructs) in which voiced/unvoiced discrimination information, pitch period information, and aperiodic-pitch information indicating whether the pitch is periodic or aperiodic are extracted from an input speech signal that has been sampled and divided into speech coding frames of a predetermined length, and encoded, wherein: in a speech coding frame whose aperiodic-pitch information indicates a periodic pitch, the pitch period information is quantized with a first predetermined number of levels and taken as periodic pitch information; in a speech coding frame whose aperiodic-pitch information indicates an aperiodic pitch, quantization levels are allocated to the respective pitch ranges according to their frequency of occurrence, the pitch period information is quantized with a second predetermined number of levels, and the result is taken as aperiodic pitch information; one codeword is assigned to the state in which the voiced/unvoiced discrimination information indicates unvoiced; for the state in which the voiced/unvoiced discrimination information indicates voiced, a number of codewords corresponding to the first predetermined number of levels is assigned to the periodic pitch information and a number of codewords corresponding to the second predetermined number of levels is assigned to the aperiodic pitch information; and these are collectively encoded as a codeword having a predetermined number of bits.
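A minimal sketch of the energy adjustment E_q = E_1 × (E_LP0 / E_LP1) recited in claim 25 above (and in claims 9, 12 and 21), applied when the LP filter of the first good frame has the higher gain; the numeric inputs and the use of impulse-response energies as a proxy for filter gain are illustrative.

```python
def adjust_excitation_energy(e1, e_lp0, e_lp1):
    """Target excitation energy E_q = E_1 * (E_LP0 / E_LP1), applied only when the
    new LP filter has the higher gain than the last erased frame's filter (the
    gain test here mirrors the claim wording; the values are illustrative)."""
    if e_lp1 > e_lp0:          # higher LP-filter gain in the first good frame
        return e1 * e_lp0 / e_lp1
    return e1

print(round(adjust_excitation_energy(e1=0.04, e_lp0=1.5, e_lp1=3.0), 6))   # expected 0.02
```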




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6266632B1

Filed: 1999-03-15     Issued: 2001-07-24

Speech decoding apparatus and speech decoding method using energy of excitation parameter

(Original Assignee) Matsushita Graphic Communication Systems Inc     (Current Assignee) Panasonic System Solutions Japan Co Ltd

Kiminori Kato, Motoyasu Ohno
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (correction unit, first sum) per sample for other frames .
US6266632B1
CLAIM 1
. A speech decoding apparatus comprising : a decoder that decodes a speech signal coded by CELP coding so as to include at least an excitation parameter , pitch information and LPC information ;
a controller that controls an output speech volume of the decoded speech signal according to a gain parameter ;
and a correction unit (average energy) that corrects the gain parameter according to an energy of the excitation parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum (average energy) by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .
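A short sketch of the energy information parameter as recited in claims 4, 16 and 24 of US7693710B2: the maximum of the signal energy for frames classified as voiced or onset, and the average energy per sample otherwise. Any windowing or log-domain quantization an actual codec would apply is omitted here.

```python
import numpy as np

def energy_info_parameter(frame, frame_class):
    """Energy information parameter per the charted claim language: maximum
    signal energy for voiced/onset frames, average energy per sample otherwise
    (a simplified sketch, not the patent's or any codec's exact computation)."""
    frame = np.asarray(frame, dtype=float)
    if frame_class in {"voiced", "onset"}:
        return float(np.max(frame ** 2))       # maximum of the signal energy
    return float(np.mean(frame ** 2))          # average energy per sample

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    frame = rng.standard_normal(256)
    print(energy_info_parameter(frame, "voiced"), energy_info_parameter(frame, "unvoiced"))
```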

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number (last non) ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .
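For comparison, a sketch paraphrasing the noise-period test of US6266632B1 claim 10 charted above: a first sum of energy differences over previous subframes, division by a predetermined number, and a second sum of the small differences. The window length and the "small" threshold are placeholders.

```python
import numpy as np

def is_noise_subframe(energies, window=8, small=0.05):
    """Noise-period test loosely following US6266632B1 claim 10; 'window' and
    'small' are illustrative values the claim leaves unspecified."""
    diffs = np.abs(np.diff(np.asarray(energies, dtype=float)))[-window:]
    first_sum = float(np.sum(diffs))                  # sum of detected differential values
    division_value = first_sum / window               # first sum divided by a predetermined number
    second_sum = float(np.sum(diffs[diffs < small]))  # sum of the differences below the threshold
    return second_sum > division_value                # noise period when the second sum dominates

print(is_noise_subframe([0.010, 0.011, 0.009, 0.010, 0.012, 0.011, 0.010, 0.009, 0.010]))  # expected True
print(is_noise_subframe([0.01, 0.30, 0.02, 0.50, 0.05, 0.40, 0.03, 0.60, 0.04]))           # expected False
```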

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6266632B1
CLAIM 5
. The speech decoding apparatus according to claim 1 , wherein said correction unit increases the gain parameter with a large increment when increasing (LP filter excitation signal) the gain parameter , and decreases the gain parameter with a small decrement when decreasing the gain parameter .

US6266632B1
CLAIM 15
. The speech decoding apparatus according to claim 1 , wherein the correction unit calculates the energy of the excitation parameter in a (n+1) subframe by the following equation : Ener(n+1)=Mamp(n+1)+((X−1)/X)×Ener(n) , wherein Ener represents the energy of the excitation parameter , Mamp represents an (LP filter excitation signal) amount of the excitation parameter , and X is an arbitrary number .
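A one-function sketch of the recursion in US6266632B1 claim 15, Ener(n+1) = Mamp(n+1) + ((X−1)/X) × Ener(n); the claim leaves X as an arbitrary number, so x=4.0 here is an assumption.

```python
def excitation_energy_track(mamp_values, x=4.0):
    """Leaky accumulation of excitation-parameter energy per US6266632B1
    claim 15; x is an assumed value for the claim's arbitrary number X."""
    ener = 0.0
    track = []
    for mamp in mamp_values:
        ener = mamp + ((x - 1.0) / x) * ener   # claim 15 recursion
        track.append(ener)
    return track

print([round(e, 3) for e in excitation_energy_track([1.0, 0.5, 0.25, 0.0])])
```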

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6266632B1
CLAIM 5
. The speech decoding apparatus according to claim 1 , wherein said correction unit increases the gain parameter with a large increment when increasing (LP filter excitation signal) the gain parameter , and decreases the gain parameter with a small decrement when decreasing the gain parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number (last non) ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US6266632B1
CLAIM 15
. The speech decoding apparatus according to claim 1 , wherein the correction unit calculates the energy of the excitation parameter in a (n+1) subframe by the following equation : Ener(n+1)=Mamp(n+1)+((X−1)/X)×Ener(n) , wherein Ener represents the energy of the excitation parameter , Mamp represents an (LP filter excitation signal) amount of the excitation parameter , and X is an arbitrary number .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6266632B1
CLAIM 5
. The speech decoding apparatus according to claim 1 , wherein said correction unit increases the gain parameter with a large increment when increasing (LP filter excitation signal) the gain parameter , and decreases the gain parameter with a small decrement when decreasing the gain parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number (last non) ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US6266632B1
CLAIM 15
. The speech decoding apparatus according to claim 1 , wherein the correction unit calculates the energy of the excitation parameter in a (n+1) subframe by the following equation : Ener(n+1)=Mamp(n+1)+((X−1)/X)×Ener(n) , wherein Ener represents the energy of the excitation parameter , Mamp represents an (LP filter excitation signal) amount of the excitation parameter , and X is an arbitrary number .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (correction unit, first sum) per sample for other frames .
US6266632B1
CLAIM 1
. A speech decoding apparatus comprising : a decoder that decodes a speech signal coded by CELP coding so as to include at least an excitation parameter , pitch information and LPC information ;
a controller that controls an output speech volume of the decoded speech signal according to a gain parameter ;
and a correction unit (average energy) that corrects the gain parameter according to an energy of the excitation parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum (average energy) by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number (last non) ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6266632B1
CLAIM 5
. The speech decoding apparatus according to claim 1 , wherein said correction unit increases the gain parameter with a large increment when increasing (LP filter excitation signal) the gain parameter , and decreases the gain parameter with a small decrement when decreasing the gain parameter .

US6266632B1
CLAIM 15
. The speech decoding apparatus according to claim 1 , wherein the correction unit calculates the energy of the excitation parameter in a (n+1) subframe by the following equation : Ener(n+1)=Mamp(n+1)+((X−1)/X)×Ener(n) , wherein Ener represents the energy of the excitation parameter , Mamp represents an (LP filter excitation signal) amount of the excitation parameter , and X is an arbitrary number .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6266632B1
CLAIM 5
. The speech decoding apparatus according to claim 1 , wherein said correction unit increases the gain parameter with a large increment when increasing (LP filter excitation signal) the gain parameter , and decreases the gain parameter with a small decrement when decreasing the gain parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number (last non) ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US6266632B1
CLAIM 15
. The speech decoding apparatus according to claim 1 , wherein the correction unit calculates the energy of the excitation parameter in a (n+1) subframe by the following equation : Ener(n+1)=Mamp(n+1)+((X−1)/X)×Ener(n) , wherein Ener represents the energy of the excitation parameter , Mamp represents an (LP filter excitation signal) amount of the excitation parameter , and X is an arbitrary number .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (correction unit, first sum) per sample for other frames .
US6266632B1
CLAIM 1
. A speech decoding apparatus comprising : a decoder that decodes a speech signal coded by CELP coding so as to include at least an excitation parameter , pitch information and LPC information ;
a controller that controls an output speech volume of the decoded speech signal according to a gain parameter ;
and a correction unit (average energy) that corrects the gain parameter according to an energy of the excitation parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum (average energy) by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6266632B1
CLAIM 5
. The speech decoding apparatus according to claim 1 , wherein said correction unit increases the gain parameter with a large increment when increasing (LP filter excitation signal) the gain parameter , and decreases the gain parameter with a small decrement when decreasing the gain parameter .

US6266632B1
CLAIM 10
. The speech decoding apparatus according to claim 9 , wherein said noise recognition unit comprising : a differential value detector that detects a differential value between energies of excitation parameters of adjacent subframes ;
a system that obtains a first sum by adding the detected differential values for a plurality of previous subframes ;
a system that obtains a division value by dividing the first sum by a predetermined number (last non) ;
a system that obtains a second sum by adding the detected differential values that are less than a predetermined value for the plurality of previous subframes ;
and a system that recognizes a noise period by detecting a subframe whose second sum is more than the division value .

US6266632B1
CLAIM 15
. The speech decoding apparatus according to claim 1 , wherein the correction unit calculates the energy of the excitation parameter in a (n+1) subframe by the following equation : Ener(n+1)=Mamp(n+1)+((X−1)/X)×Ener(n) , wherein Ener represents the energy of the excitation parameter , Mamp represents an (LP filter excitation signal) amount of the excitation parameter , and X is an arbitrary number .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6240387B1

Filed: 1999-02-12     Issued: 2001-05-29

Method and apparatus for performing speech frame encoding mode selection in a variable rate encoding system

(Original Assignee) Qualcomm Inc     (Current Assignee) Qualcomm Inc

Andrew P. DeJaco
US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (third threshold) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6240387B1
CLAIM 1
. A method of encoding a speech frame , comprising the steps of : selecting a first encoding mode if a normalized autocorrelation measurement parameter is exceeded by a first threshold value and if a zero crossings count parameter exceeds a second threshold value ;
selecting a second encoding mode if the first encoding mode is not selected and if an energy differential measurement parameter is exceeded by a third threshold (current frame) value ;
selecting a third encoding mode if the first and second encoding modes are not selected and if an encoding quality parameter exceeds a fourth threshold value and if a prediction gain differential measurement parameter is exceeded by a fifth threshold value and if the normalized autocorrelation measurement parameter exceeds a sixth threshold value ;
selecting a fourth encoding mode if the first , second , and third encoding modes are not selected ;
and encoding the speech frame in accordance with the selected encoding mode .
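A sketch of the threshold cascade in US6240387B1 claim 1 as charted above; the claim recites no numeric thresholds, so t1 through t6 are placeholders, and "is exceeded by a threshold value" is read here as the parameter lying below that threshold.

```python
def select_encoding_mode(nacf, zero_crossings, energy_diff, quality, pred_gain_diff,
                         t1=0.25, t2=50, t3=0.1, t4=0.8, t5=0.3, t6=0.5):
    """Mode-selection cascade paraphrasing US6240387B1 claim 1; all threshold
    values are illustrative placeholders."""
    if nacf < t1 and zero_crossings > t2:              # autocorrelation exceeded by t1, zero crossings exceed t2
        return "mode 1"
    if energy_diff < t3:                               # energy differential exceeded by t3
        return "mode 2"
    if quality > t4 and pred_gain_diff < t5 and nacf > t6:
        return "mode 3"
    return "mode 4"

print(select_encoding_mode(nacf=0.1, zero_crossings=60, energy_diff=0.5, quality=0.9, pred_gain_diff=0.1))
print(select_encoding_mode(nacf=0.9, zero_crossings=10, energy_diff=0.5, quality=0.9, pred_gain_diff=0.1))
```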

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (third threshold) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6240387B1
CLAIM 1
. A method of encoding a speech frame , comprising the steps of : selecting a first encoding mode if a normalized autocorrelation measurement parameter is exceeded by a first threshold value and if a zero crossings count parameter exceeds a second threshold value ;
selecting a second encoding mode if the first encoding mode is not selected and if an energy differential measurement parameter is exceeded by a third threshold (current frame) value ;
selecting a third encoding mode if the first and second encoding modes are not selected and if an encoding quality parameter exceeds a fourth threshold value and if a prediction gain differential measurement parameter is exceeded by a fifth threshold value and if the normalized autocorrelation measurement parameter exceeds a sixth threshold value ;
selecting a fourth encoding mode if the first , second , and third encoding modes are not selected ;
and encoding the speech frame in accordance with the selected encoding mode .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (third threshold) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6240387B1
CLAIM 1
. A method of encoding a speech frame , comprising the steps of : selecting a first encoding mode if a normalized autocorrelation measurement parameter is exceeded by a first threshold value and if a zero crossings count parameter exceeds a second threshold value ;
selecting a second encoding mode if the first encoding mode is not selected and if an energy differential measurement parameter is exceeded by a third threshold (current frame) value ;
selecting a third encoding mode if the first and second encoding modes are not selected and if an encoding quality parameter exceeds a fourth threshold value and if a prediction gain differential measurement parameter is exceeded by a fifth threshold value and if the normalized autocorrelation measurement parameter exceeds a sixth threshold value ;
selecting a fourth encoding mode if the first , second , and third encoding modes are not selected ;
and encoding the speech frame in accordance with the selected encoding mode .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (third threshold) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6240387B1
CLAIM 1
. A method of encoding a speech frame , comprising the steps of : selecting a first encoding mode if a normalized autocorrelation measurement parameter is exceeded by a first threshold value and if a zero crossings count parameter exceeds a second threshold value ;
selecting a second encoding mode if the first encoding mode is not selected and if an energy differential measurement parameter is exceeded by a third threshold (current frame) value ;
selecting a third encoding mode if the first and second encoding modes are not selected and if an encoding quality parameter exceeds a fourth threshold value and if a prediction gain differential measurement parameter is exceeded by a fifth threshold value and if the normalized autocorrelation measurement parameter exceeds a sixth threshold value ;
selecting a fourth encoding mode if the first , second , and third encoding modes are not selected ;
and encoding the speech frame in accordance with the selected encoding mode .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2000214900A

Filed: 1999-01-22     Issued: 2000-08-04

Speech encoding/decoding method (音声符号化/復号化方法)

(Original Assignee) Toshiba Corp; 株式会社東芝     

Ko Amada, Katsumi Tsuchiya, 勝美 土谷, 皇 天田
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2000214900A
CLAIM 1
[Claim 1] A speech encoding method in which a speech signal (sound signal, speech signal) is represented and encoded by at least information representing the characteristics of a synthesis filter and an excitation signal for driving the synthesis filter, wherein the excitation signal is constituted by a pulse train including pulses each selected from either a first pulse set at a sample-point position of the excitation signal or a second pulse set at a position between sample points of the excitation signal.

JP2000214900A
CLAIM 4
[Claim 4] A speech decoding (sound signal, speech signal) method in which a speech signal is decoded by feeding an excitation signal to a synthesis filter, wherein the excitation signal is constituted by a pulse train including pulses each selected from either a first pulse set at a sample-point position of the excitation signal or a second pulse set at a position between sample points of the excitation signal.
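A compact sketch of the artificial periodic excitation of US7693710B2 claim 1 charted above: a low-pass filter impulse response centred on the quantized position of the first glottal pulse and repeated every (average) pitch period up to the end of the affected region. The 5-tap filter, the frame length and the pulse spacing are assumptions for illustration.

```python
import numpy as np

def periodic_excitation(frame_len, first_pulse_pos, pitch_period, lp_taps=None):
    """Build a low-pass filtered periodic pulse train: centre the filter impulse
    response on the quantized first glottal pulse position and repeat it at the
    pitch period (illustrative sketch; the filter is a placeholder)."""
    if lp_taps is None:
        lp_taps = np.array([0.1, 0.25, 0.3, 0.25, 0.1])   # placeholder low-pass impulse response
    half = len(lp_taps) // 2
    exc = np.zeros(frame_len)
    pos = first_pulse_pos
    while pos < frame_len:                                 # place impulse responses pitch-synchronously
        for k, tap in enumerate(lp_taps):
            idx = pos - half + k
            if 0 <= idx < frame_len:
                exc[idx] += tap                            # centre the response on the pulse position
        pos += pitch_period
    return exc

if __name__ == "__main__":
    e = periodic_excitation(frame_len=160, first_pulse_pos=12, pitch_period=57)
    print(np.nonzero(e)[0])
```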

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000214900A
CLAIM 1
[Claim 1] A speech encoding method in which a speech signal (sound signal, speech signal) is represented and encoded by at least information representing the characteristics of a synthesis filter and an excitation signal for driving the synthesis filter, wherein the excitation signal is constituted by a pulse train including pulses each selected from either a first pulse set at a sample-point position of the excitation signal or a second pulse set at a position between sample points of the excitation signal.

JP2000214900A
CLAIM 4
[Claim 4] A speech decoding (sound signal, speech signal) method in which a speech signal is decoded by feeding an excitation signal to a synthesis filter, wherein the excitation signal is constituted by a pulse train including pulses each selected from either a first pulse set at a sample-point position of the excitation signal or a second pulse set at a position between sample points of the excitation signal.

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2000214900A
CLAIM 1
[Claim 1] A speech encoding method in which a speech signal (sound signal, speech signal) is represented and encoded by at least information representing the characteristics of a synthesis filter and an excitation signal for driving the synthesis filter, wherein the excitation signal is constituted by a pulse train including pulses each selected from either a first pulse set at a sample-point position of the excitation signal or a second pulse set at a position between sample points of the excitation signal.

JP2000214900A
CLAIM 4
[Claim 4] A speech decoding (sound signal, speech signal) method in which a speech signal is decoded by feeding an excitation signal to a synthesis filter, wherein the excitation signal is constituted by a pulse train including pulses each selected from either a first pulse set at a sample-point position of the excitation signal or a second pulse set at a position between sample points of the excitation signal.

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
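The energy-control limitation above amounts to a two-point gain trajectory in the first good frame after an erasure: a start gain that matches the energy left by the concealment, and an end gain that steers toward the transmitted energy parameter, with the increase bounded. A minimal sketch, assuming a linear sample-by-sample gain interpolation and an illustrative clamp value:

    import numpy as np

    def rescale_first_good_frame(synth, e_end_concealed, e_target, max_gain=1.98):
        # Sketch: g0 matches the energy at the start of the first non-erased frame
        # to the energy at the end of the concealed frame; g1 converges toward the
        # received energy parameter; both gains are clamped to limit any increase.
        n = len(synth)
        rms_start = np.sqrt(np.mean(synth[:n // 4] ** 2) + 1e-12)
        rms_end = np.sqrt(np.mean(synth[-(n // 4):] ** 2) + 1e-12)
        g0 = min(np.sqrt(e_end_concealed + 1e-12) / rms_start, max_gain)
        g1 = min(np.sqrt(e_target + 1e-12) / rms_end, max_gain)
        gains = np.linspace(g0, g1, n)     # sample-by-sample gain interpolation
        return synth * gains

    frame = np.random.default_rng(1).normal(scale=0.2, size=160)
    print(np.round(rescale_first_good_frame(frame, 0.09, 0.04)[:4], 4))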
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
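The two transition rules in this limitation reduce to a simple gain selection at the start of the first good frame. A sketch of that decision logic, with the class labels and flag names chosen here only for illustration:

    def choose_start_gain(g_end, last_class, first_class,
                          last_was_comfort_noise, first_is_active_speech,
                          g_start_default):
        # Sketch: during a voiced-to-unvoiced transition, or when moving from
        # comfort noise to active speech, the start-of-frame scaling gain is set
        # equal to the end-of-frame gain; otherwise the normal start gain is kept.
        voiced_like = {"voiced transition", "voiced", "onset"}
        voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
        cn_to_active = last_was_comfort_noise and first_is_active_speech
        return g_end if (voiced_to_unvoiced or cn_to_active) else g_start_default

    print(choose_start_gain(0.7, "voiced", "unvoiced", False, False, 1.2))  # -> 0.7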
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
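Reading the reconstructed relation as E_q = E_1 · E_LP0 / E_LP1, the two filter energies are energies of LP synthesis filter impulse responses. The Python sketch below computes them by direct recursion; the 64-sample truncation of the impulse response and the toy first-order filters are assumptions for illustration.

    import numpy as np

    def lp_impulse_energy(a, length=64):
        # Energy of the impulse response of the LP synthesis filter 1/A(z),
        # with a = [1, a1, ..., aM]; the 64-sample truncation is an assumption.
        h = np.zeros(length)
        for n in range(length):
            x = 1.0 if n == 0 else 0.0
            h[n] = x - sum(a[k] * h[n - k] for k in range(1, len(a)) if n - k >= 0)
        return float(np.sum(h ** 2))

    def adjusted_energy(e1, a_prev, a_curr):
        # Sketch of E_q = E_1 * E_LP0 / E_LP1, applied when the new LP filter has a
        # higher gain than the one used during the concealment.
        e_lp0 = lp_impulse_energy(a_prev)  # last good frame before the erasure
        e_lp1 = lp_impulse_energy(a_curr)  # first good frame after the erasure
        return e1 * e_lp0 / e_lp1

    a_prev = np.array([1.0, -0.9])         # toy first-order LP filters
    a_curr = np.array([1.0, -0.5])
    print(round(adjusted_energy(2.0, a_prev, a_curr), 3))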
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
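The artificial periodic excitation described in this limitation is, in essence, a pulse train convolved with a low-pass filter: the first impulse response is centred on the quantized first-glottal-pulse position and the following ones are placed one (rounded) average pitch period apart until the end of the affected region. A minimal sketch, assuming a short symmetric FIR impulse response chosen here only for illustration:

    import numpy as np

    def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, lp_ir):
        # Sketch: periodic part of the concealed excitation as a low-pass filtered
        # train of pulses.  The first impulse response is centred on the quantized
        # first-glottal-pulse position; later copies follow at the average pitch.
        exc = np.zeros(frame_len)
        half = len(lp_ir) // 2
        pos = q_pulse_pos
        while pos < frame_len:
            start = pos - half
            for i, h in enumerate(lp_ir):              # overlap-add one centred copy
                idx = start + i
                if 0 <= idx < frame_len:
                    exc[idx] += h
            pos += int(round(avg_pitch))
        return exc

    lp_ir = np.array([0.1, 0.3, 0.6, 1.0, 0.6, 0.3, 0.1])  # toy symmetric low-pass IR
    exc = build_periodic_excitation(160, q_pulse_pos=20, avg_pitch=57.4, lp_ir=lp_ir)
    print(np.nonzero(exc)[0])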
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2000214900A
CLAIM 1
【請求項1】 A speech coding method in which a speech signal (音声信号: sound signal, speech signal) is represented and encoded with at least information expressing the characteristics of a synthesis filter and a drive signal for driving the synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.

JP2000214900A
CLAIM 4
【請求項4】 A speech decoding (音声復号: sound signal, speech signal) method in which a speech signal is decoded by inputting a drive signal into a synthesis filter, characterized in that the drive signal is constituted by a pulse train including a pulse selected from either a first pulse set at a sample-point position of the drive signal or a second pulse set at a position between sample points of the drive signal.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP0932141A2

Filed: 1999-01-18     Issued: 1999-07-28

Method for signal controlled switching between different audio coding schemes

(Original Assignee) Deutsche Telekom AG     (Current Assignee) Deutsche Telekom AG

Ralf Kirchherr, Joachim Stegmann
US7693710B2
CLAIM 1
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (error concealment, current frame, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (error concealment, current frame, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (frame erasure) is classified as onset , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (error concealment, current frame, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (domain decoder) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

EP0932141A2
CLAIM 22
Multicode decoder comprising : a digital signal input (80) ;
a time domain decoder (LP filter) (60) for selectively receiving data from the digital signal input (10) ;
a transform decoder (70) for selectively receiving data from the digital signal input (81) ;
and switches (81 , 82) for switching the digital signal input (10) and a digital output (83) between the time domain decoder (60) and the transform decoder (70) .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (domain decoder) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame (error concealment, current frame, speech signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

EP0932141A2
CLAIM 22
Multicode decoder comprising : a digital signal input (80) ;
a time domain decoder (LP filter) (60) for selectively receiving data from the digital signal input (10) ;
a transform decoder (70) for selectively receiving data from the digital signal input (81) ;
and switches (81 , 82) for switching the digital signal input (10) and a digital output (83) between the time domain decoder (60) and the transform decoder (70) .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
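
Claim 10 recites encoding the shape, sign and amplitude of the first glottal pulse as phase information. A minimal Python sketch of one way such a descriptor could be packed is given below, assuming a small pulse-shape codebook and a uniform amplitude quantizer; the names, level counts and ranges are hypothetical, not material from the patent.

# Illustrative sketch (assumption): pack the glottal-pulse phase information as
# a shape index, a sign bit and a quantized amplitude index.
def encode_glottal_pulse(position, shape_index, amplitude,
                         num_shapes=8, amp_levels=16, amp_max=4096.0):
    sign = 1 if amplitude >= 0 else 0
    amp_q = min(amp_levels - 1,
                int(round(abs(amplitude) / amp_max * (amp_levels - 1))))
    return {"position": position,
            "shape": shape_index % num_shapes,
            "sign": sign,
            "amplitude_index": amp_q}

print(encode_glottal_pulse(position=37, shape_index=2, amplitude=-1850.0))
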
EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment for frame erasure (frame erasure) s by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
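
Claim 11 locates the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizes its position. The sketch below shows that search and a simple uniform position quantizer; the residual buffer, grid step and function name are hypothetical placeholders.

# Illustrative sketch (assumption): find the residual sample of maximum absolute
# amplitude within one pitch period from the frame start, then quantize its
# position on a uniform grid.
def first_glottal_pulse_position(residual, pitch_period, step=2):
    window = residual[:pitch_period]
    position = max(range(len(window)), key=lambda n: abs(window[n]))
    quantized = (position // step) * step        # uniform position quantizer
    return position, quantized

res = [0.1, -0.3, 0.2, 4.8, -0.7, 0.4, 0.1, -0.2, 0.0, 0.3]
print(first_glottal_pulse_position(res, pitch_period=8))  # -> (3, 2)
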
EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment for frame erasure (frame erasure) s by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (domain decoder) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame (error concealment, current frame, speech signal) , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
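
Claim 12 compares the gains of two LP filters through the energies of their impulse responses (E LP0 and E LP1). The sketch below shows one conventional way such an energy can be approximated, by passing a unit impulse through the all-pole synthesis filter 1/A(z) and summing the squared output over a finite window; the coefficient convention, window length and function name are assumptions, not taken from the patent.

# Illustrative sketch (assumption): impulse-response energy of an LP synthesis
# filter 1/A(z) with A(z) = 1 + a1*z^-1 + ... + aM*z^-M.
def lp_impulse_response_energy(a, length=64):
    h = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        y = x - sum(a[k] * h[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        h.append(y)
    return sum(v * v for v in h)

# In the claim, this quantity would be evaluated once for the LP filter of the
# last good frame before the erasure (E LP0) and once for the LP filter of the
# first good frame after it (E LP1).
print(lp_impulse_response_energy([-1.2, 0.5]))
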
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

EP0932141A2
CLAIM 22
Multicode decoder comprising : a digital signal input (80) ;
a time domain decoder (LP filter) (60) for selectively receiving data from the digital signal input (10) ;
a transform decoder (70) for selectively receiving data from the digital signal input (81) ;
and switches (81 , 82) for switching the digital signal input (10) and a digital output (83) between the time domain decoder (60) and the transform decoder (70) .

US7693710B2
CLAIM 13
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
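
Claim 13 constructs the periodic excitation artificially as a low-pass filtered train of pulses, with the first low-pass impulse response centred on the quantized first-glottal-pulse position and the remaining responses spaced by the average pitch value. The sketch below is a minimal rendering of that placement rule; the three-tap low-pass response, frame length and names are placeholders.

# Illustrative sketch (assumption): place copies of a short low-pass filter
# impulse response at the quantized first-pulse position and then once every
# average pitch period until the end of the frame.
def artificial_periodic_excitation(frame_len, first_pulse_pos, avg_pitch,
                                   lp_ir=(0.18, 0.64, 0.18)):
    exc = [0.0] * frame_len
    center = first_pulse_pos
    half = len(lp_ir) // 2
    while center < frame_len:
        for i, tap in enumerate(lp_ir):
            n = center - half + i            # impulse response centred on the pulse
            if 0 <= n < frame_len:
                exc[n] += tap
        center += avg_pitch                  # next pulse one average pitch later
    return exc

exc = artificial_periodic_excitation(frame_len=64, first_pulse_pos=10, avg_pitch=20)
print([round(v, 2) for v in exc[8:14]])      # low-pass pulse centred at sample 10
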
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 14
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 15
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 16
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (error concealment, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
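
Claim 16 computes the energy information parameter differently depending on the frame class: in relation to a maximum of the signal energy for voiced or onset frames, and in relation to an average energy per sample otherwise. A minimal sketch of that branch is shown below; the class labels follow the claim, while the function name and example frame are hypothetical.

# Illustrative sketch (assumption): energy information parameter selected by
# frame classification.
def energy_information(frame, frame_class):
    if frame_class in ("voiced", "onset"):
        return max(x * x for x in frame)           # maximum of the signal energy
    return sum(x * x for x in frame) / len(frame)  # average energy per sample

frame = [0.1, -0.4, 0.9, -0.2]
print(energy_information(frame, "onset"), energy_information(frame, "unvoiced"))
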
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 17
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
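
Claim 17 scales the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converges toward the energy indicated by the received energy information parameter by the end of that frame, while limiting any energy increase. The sketch below realizes this as a capped, linearly interpolated per-sample gain; the interpolation shape, cap value and names are assumptions rather than the patent's stated implementation.

# Illustrative sketch (assumption): sample-by-sample gain from g0 (matching the
# end of the last concealed frame) to g1 (matching the transmitted energy
# information), with both gains capped to limit the energy increase.
def scale_recovery_frame(frame, g0, g1, max_gain=2.0):
    g0 = min(g0, max_gain)
    g1 = min(g1, max_gain)                      # limit any energy increase
    n_total = len(frame)
    out = []
    for n, x in enumerate(frame):
        g = g0 + (g1 - g0) * n / (n_total - 1)  # linear gain interpolation
        out.append(g * x)
    return out

print([round(v, 2) for v in scale_recovery_frame([1.0] * 5, g0=0.5, g1=1.5)])
# -> [0.5, 0.75, 1.0, 1.25, 1.5]
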
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (error concealment, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (frame erasure) is classified as onset , the decoder , for conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (error concealment, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 20
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (domain decoder) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

EP0932141A2
CLAIM 22
Multicode decoder comprising : a digital signal input (80) ;
a time domain decoder (LP filter) (60) for selectively receiving data from the digital signal input (10) ;
a transform decoder (70) for selectively receiving data from the digital signal input (81) ;
and switches (81 , 82) for switching the digital signal input (10) and a digital output (83) between the time domain decoder (60) and the transform decoder (70) .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (domain decoder) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame (error concealment, current frame, speech signal) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

EP0932141A2
CLAIM 22
Multicode decoder comprising : a digital signal input (80) ;
a time domain decoder (LP filter) (60) for selectively receiving data from the digital signal input (10) ;
a transform decoder (70) for selectively receiving data from the digital signal input (81) ;
and switches (81 , 82) for switching the digital signal input (10) and a digital output (83) between the time domain decoder (60) and the transform decoder (70) .

US7693710B2
CLAIM 22
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 23
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 24
. A device for conducting concealment (error concealment, current frame, speech signal) of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (error concealment, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (error concealment, current frame, speech signal) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (error concealment, current frame, speech signal) and decoder recovery when a gain of a LP filter (domain decoder) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame (error concealment, current frame, speech signal) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0932141A2
CLAIM 1
Method for signal controlled switching between audio coding schemes comprising : receiving input audio signals ;
classifying a first set of the input audio signals as speech or non-speech signal (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) s ;
coding the speech signals using a time domain coding scheme ;
and coding the nonspeech signals using a transform coding scheme .

EP0932141A2
CLAIM 5
Method according to claim 4 further comprising sampling the input audio signals so as to form a plurality of frames , the plurality of frames including a current frame (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) to be classified and a previous frame , the classifying step further including determining a difference between LSF coefficients of the current frame and the previous frame .

EP0932141A2
CLAIM 18
Method according to claim 17 providing error concealment (decoder concealment, frame erasure (frame erasure) concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment, current frame, speech signal) for frame erasures by continuing processing in the first mode , if the previous frame was processed in the first mode , and by processing in the fourth mode , if the previous frame was not processed in the first mode .

EP0932141A2
CLAIM 22
Multicode decoder comprising : a digital signal input (80) ;
a time domain decoder (LP filter) (60) for selectively receiving data from the digital signal input (10) ;
a transform decoder (70) for selectively receiving data from the digital signal input (81) ;
and switches (81 , 82) for switching the digital signal input (10) and a digital output (83) between the time domain decoder (60) and the transform decoder (70) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5987406A

Filed: 1999-01-15     Issued: 1999-11-16

Instability eradication for analysis-by-synthesis speech codecs

(Original Assignee) Universite de Sherbrooke     (Current Assignee) Universite de Sherbrooke

Tero Honkanen, Claude Laflamme, Jean-Pierre Adoul
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (signal component) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5987406A
CLAIM 1
. A cellular network element comprising (a) a transmitter including analysis-by-synthesis encoding means for encoding a speech signal and means for transmitting the encoded speech signal , and (b) a receiver including means for receiving a transmitted encoded speech signal and means for decoding the received encoded speech signal ;
wherein the analysis-by-synthesis speech signal encoding means of the transmitter is provided with an encoder system comprising : an analysis-by-synthesis encoder section for encoding the speech signal , comprising : first means for producing , in response to the speech signal and at regular time intervals called frames , a description of an innovation signal to be supplied as excitation signal to a synthesis filter in view of synthesizing said speech signal ;
second means for producing , in response to the speech signal and at said regular time intervals , a set of spectral parameters for use in driving the synthesis filter ;
and third means for producing , in response to the speech signal and at said regular time intervals , pitch information including a pitch gain for constricting a past-excitation-signal component (maximum amplitude) added to said excitation signal ;
and an instability eradication section comprising : detecting means for detecting a set of conditions related to the spectral parameters and the pitch gain ;
and modifying means for reducing the pitch gain to a value lower than a given threshold whenever the conditions of said set are detected in order to eradicate said occasional instability .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (high value) and the first non erased frame received after frame erasure is encoded as active speech .
US5987406A
CLAIM 2
. A cellular network element as recited in claim 1 , wherein the conditions of said set comprise : a resonance condition assessed from the spectral parameters ;
a duration condition detected when the resonance condition has prevailed for at least the M most recent frames , M being an integer greater than 1 ;
and a gain condition which evidences consistently-high value (comfort noise) s of the pitch gain in the N most recent frames , N being an integer greater than 1 .
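
Claims 1 and 2 of US5987406A reduce the pitch gain below a given threshold only when a resonance condition, a duration condition over the M most recent frames and a consistently-high pitch-gain condition over the N most recent frames are all detected. The sketch below is a minimal decision rule along those lines; the values of M, N, the threshold and the amount of reduction are placeholders, not taken from the reference.

# Illustrative sketch (assumption): pull the pitch gain below the threshold only
# when the resonance/duration and high-gain conditions all hold.
def eradicate_instability(pitch_gain, resonance_flags, gain_history,
                          M=3, N=3, threshold=0.9):
    duration_ok = len(resonance_flags) >= M and all(resonance_flags[-M:])
    high_gain = len(gain_history) >= N and all(g > threshold for g in gain_history[-N:])
    if duration_ok and high_gain:
        return min(pitch_gain, threshold - 0.05)   # reduce below the given threshold
    return pitch_gain

print(eradicate_instability(1.1, [True, True, True], [0.95, 1.0, 1.1]))   # gain reduced below the threshold
print(eradicate_instability(1.1, [True, False, True], [0.95, 1.0, 1.1]))  # gain unchanged (1.1)
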

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (signal component) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5987406A
CLAIM 1
. A cellular network element comprising (a) a transmitter including analysis-by-synthesis encoding means for encoding a speech signal and means for transmitting the encoded speech signal , and (b) a receiver including means for receiving a transmitted encoded speech signal and means for decoding the received encoded speech signal ;
wherein the analysis-by-synthesis speech signal encoding means of the transmitter is provided with an encoder system comprising : an analysis-by-synthesis encoder section for encoding the speech signal , comprising : first means for producing , in response to the speech signal and at regular time intervals called frames , a description of an innovation signal to be supplied as excitation signal to a synthesis filter in view of synthesizing said speech signal ;
second means for producing , in response to the speech signal and at said regular time intervals , a set of spectral parameters for use in driving the synthesis filter ;
and third means for producing , in response to the speech signal and at said regular time intervals , pitch information including a pitch gain for constricting a past-excitation-signal component (maximum amplitude) added to said excitation signal ;
and an instability eradication section comprising : detecting means for detecting a set of conditions related to the spectral parameters and the pitch gain ;
and modifying means for reducing the pitch gain to a value lower than a given threshold whenever the conditions of said set are detected in order to eradicate said occasional instability .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (signal component) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5987406A
CLAIM 1
. A cellular network element comprising (a) a transmitter including analysis-by-synthesis encoding means for encoding a speech signal and means for transmitting the encoded speech signal , and (b) a receiver including means for receiving a transmitted encoded speech signal and means for decoding the received encoded speech signal ;
wherein the analysis-by-synthesis speech signal encoding means of the transmitter is provided with an encoder system comprising : an analysis-by-synthesis encoder section for encoding the speech signal , comprising : first means for producing , in response to the speech signal and at regular time intervals called frames , a description of an innovation signal to be supplied as excitation signal to a synthesis filter in view of synthesizing said speech signal ;
second means for producing , in response to the speech signal and at said regular time intervals , a set of spectral parameters for use in driving the synthesis filter ;
and third means for producing , in response to the speech signal and at said regular time intervals , pitch information including a pitch gain for constricting a past-excitation-signal component (maximum amplitude) added to said excitation signal ;
and an instability eradication section comprising : detecting means for detecting a set of conditions related to the spectral parameters and the pitch gain ;
and modifying means for reducing the pitch gain to a value lower than a given threshold whenever the conditions of said set are detected in order to eradicate said occasional instability .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (high value) and the first non erased frame received after frame erasure is encoded as active speech .
US5987406A
CLAIM 2
. A cellular network element as recited in claim 1 , wherein the conditions of said set comprise : a resonance condition assessed from the spectral parameters ;
a duration condition detected when the resonance condition has prevailed for at least the M most recent frames , M being an integer greater than 1 ;
and a gain condition which evidences consistently-high value (comfort noise) s of the pitch gain in the N most recent frames , N being an integer greater than 1 .
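
A minimal sketch of the gain-equalization condition in claim 19 above: when the last good frame before an erasure was voiced-like and the first good frame after it is unvoiced, or when the codec moves from comfort noise to active speech, the scaling gain at the start of the recovered frame is set equal to the gain used at its end. The class labels and the default gain value are assumptions for illustration.

```python
# Hedged sketch of the gain-equalization rule; labels and gains are illustrative.
VOICED_LIKE = {"VOICED", "VOICED TRANSITION", "ONSET"}

def start_scaling_gain(last_class_before_erasure: str,
                       first_class_after_erasure: str,
                       last_frame_was_comfort_noise: bool,
                       first_frame_is_active_speech: bool,
                       gain_at_frame_end: float,
                       default_start_gain: float) -> float:
    voiced_to_unvoiced = (last_class_before_erasure in VOICED_LIKE
                          and first_class_after_erasure == "UNVOICED")
    dtx_to_active = last_frame_was_comfort_noise and first_frame_is_active_speech
    if voiced_to_unvoiced or dtx_to_active:
        return gain_at_frame_end          # no gain ramp across the recovered frame
    return default_start_gain

print(start_scaling_gain("ONSET", "UNVOICED", False, True, 0.7, 1.0))  # -> 0.7
```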

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (signal component) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5987406A
CLAIM 1
. A cellular network element comprising (a) a transmitter including analysis-by-synthesis encoding means for encoding a speech signal and means for transmitting the encoded speech signal , and (b) a receiver including means for receiving a transmitted encoded speech signal and means for decoding the received encoded speech signal ;
wherein the analysis-by-synthesis speech signal encoding means of the transmitter is provided with an encoder system comprising : an analysis-by-synthesis encoder section for encoding the speech signal , comprising : first means for producing , in response to the speech signal and at regular time intervals called frames , a description of an innovation signal to be supplied as excitation signal to a synthesis filter in view of synthesizing said speech signal ;
second means for producing , in response to the speech signal and at said regular time intervals , a set of spectral parameters for use in driving the synthesis filter ;
and third means for producing , in response to the speech signal and at said regular time intervals , pitch information including a pitch gain for constricting a past-excitation-signal component (maximum amplitude) added to said excitation signal ;
and an instability eradication section comprising : detecting means for detecting a set of conditions related to the spectral parameters and the pitch gain ;
and modifying means for reducing the pitch gain to a value lower than a given threshold whenever the conditions of said set are detected in order to eradicate said occasional instability .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6311154B1

Filed: 1998-12-30     Issued: 2001-10-30

Adaptive windows for analysis-by-synthesis CELP-type speech coding

(Original Assignee) Nokia Mobile Phones Ltd     (Current Assignee) Nokia Mobile Phones Ltd ; Microsoft Technology Licensing LLC

Allen Gersho, Vladimir Cuperman, Ajit V Rao, Tung-Chiang Yang, Sassan Ahmadi, Fenghua Liu
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (encoded speech signal, pitch period) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 25
. A wireless voice communicator , comprising ;
a wireless transceiver comprising a transmitter and a receiver ;
an input speech transducer and an output speech transducer ;
and a speech processor comprising , a sampling and framing unit having an input coupled to an output of said input speech transducer for partitioning samples of an input speech signal into frames ;
a first classifier for classifying a frame as being one of an unvoiced frame or a not unvoiced frame and a second classifier for classifying said not unvoiced frame as being one of a voiced frame or a transition frame ;
a windowing unit for determining the location of at least one window in a frame ;
and an encoder for providing an encoded speech signal (decoder determines concealment, decoder concealment, pitch period) where , in an excitation for the frame , all or substantially all of non-zero excitation amplitudes lie within the at least one window ;
said wireless communicator further comprising a modulator for modulating a carrier with the encoded speech signal , said modulator having an output coupled to an input of said transmitter ;
a demodulator having an input coupled to an output of said receiver for demodulating a carrier that is encoded with a speech signal and that was transmitted from a remote transmitter ;
and said speech processor further comprising a decoder having an input coupled to an output of said demodulator for decoding an excitation from a frame wherein all or substantially all of non-zero excitation amplitudes lie within at least one window , said decoder having an output coupled to an input of said output speech transducer .

US6311154B1
CLAIM 32
. A wireless communicator as in claim 31 , wherein the predetermined distance is one pitch period (decoder determines concealment, decoder concealment, pitch period) , and wherein the jitter value is an integer between about −8 and about +7 .
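
A minimal sketch of the artificial periodic excitation recited in claim 1 above: a low-pass filter impulse response is centered on the decoded first-glottal-pulse position and repeated at the average pitch spacing up to the end of the concealed region. The filter coefficients, frame length, and the name build_periodic_excitation are assumptions for illustration only.

```python
# Hedged sketch: low-pass filtered periodic pulse train for a lost onset frame.
import numpy as np

def build_periodic_excitation(frame_len: int,
                              first_pulse_pos: int,
                              avg_pitch: float,
                              lp_impulse: np.ndarray) -> np.ndarray:
    """Center lp_impulse on the decoded pulse position, then repeat every avg_pitch."""
    exc = np.zeros(frame_len)
    half = len(lp_impulse) // 2
    pos = float(first_pulse_pos)
    while pos < frame_len:
        center = int(round(pos))
        for k, h in enumerate(lp_impulse):
            idx = center - half + k
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += avg_pitch                      # spacing = average pitch value
    return exc

lp_ir = np.array([0.1, 0.25, 0.3, 0.25, 0.1])  # toy low-pass impulse response
print(build_periodic_excitation(160, 12, 57.5, lp_ir)[:20])
```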

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .
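
A minimal sketch of the shape/sign/amplitude element of claim 2 above, assuming a small hypothetical codebook of pulse shapes and a log-domain amplitude quantizer; the codebook entries and bit allocation are invented for illustration and are not taken from the patent or the references.

```python
# Hedged sketch: encode shape, sign and amplitude of the located first pulse.
import numpy as np

SHAPES = np.array([[0.0, 1.0, 0.0],          # single impulse
                   [0.3, 1.0, 0.3],          # slightly spread pulse
                   [-0.3, 1.0, -0.3]])       # pulse with side lobes (toy codebook)

def encode_first_pulse(residual: np.ndarray, pos: int):
    seg = residual[max(pos - 1, 0):pos + 2]
    seg = np.pad(seg, (0, 3 - len(seg)))     # pad at frame edges
    sign = 1 if residual[pos] >= 0 else -1
    amp = abs(residual[pos])
    shape_idx = int(np.argmax(SHAPES @ (sign * seg)))   # best-correlated shape
    amp_idx = int(np.clip(np.round(4 * np.log2(amp + 1e-9) + 16), 0, 31))
    return shape_idx, sign, amp_idx          # transmitted to the decoder

res = np.zeros(160); res[39] = -0.2; res[40] = -0.8; res[41] = -0.2
print(encode_first_pulse(res, 40))
```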

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (encoded speech signal, pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 25
. A wireless voice communicator , comprising ;
a wireless transceiver comprising a transmitter and a receiver ;
an input speech transducer and an output speech transducer ;
and a speech processor comprising , a sampling and framing unit having an input coupled to an output of said input speech transducer for partitioning samples of an input speech signal into frames ;
a first classifier for classifying a frame as being one of an unvoiced frame or a not unvoiced frame and a second classifier for classifying said not unvoiced frame as being one of a voiced frame or a transition frame ;
a windowing unit for determining the location of at least one window in a frame ;
and an encoder for providing an encoded speech signal (decoder determines concealment, decoder concealment, pitch period) where , in an excitation for the frame , all or substantially all of non-zero excitation amplitudes lie within the at least one window ;
said wireless communicator further comprising a modulator for modulating a carrier with the encoded speech signal , said modulator having an output coupled to an input of said transmitter ;
a demodulator having an input coupled to an output of said receiver for demodulating a carrier that is encoded with a speech signal and that was transmitted from a remote transmitter ;
and said speech processor further comprising a decoder having an input coupled to an output of said demodulator for decoding an excitation from a frame wherein all or substantially all of non-zero excitation amplitudes lie within at least one window , said decoder having an output coupled to an input of said output speech transducer .

US6311154B1
CLAIM 32
. A wireless communicator as in claim 31 , wherein the predetermined distance is one pitch period (decoder determines concealment, decoder concealment, pitch period) , and wherein the jitter value is an integer between about −8 and about +7 .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 34
. A speech decoder , comprising : a class decoder having an input coupled to an input node of said speech decoder for extracting from an input bit stream predetermined ones of bits encoding class information for an encoded speech signal frame and for decoding the class information , wherein there are a plurality of predetermined classes ;
said plurality of predetermined classes comprises a voiced class , an unvoiced class and a transition class ;
and wherein said input bit stream is also coupled to an input of a LSP decoder ;
a first multi-position switch element controlled by an output of said class decoder for directing said input bit stream to an input of one of selected one of a plurality of excitation generators , an individual one of said excitation generators corresponding to one of said plurality of predetermined classes ;
a second multi-position switch element controlled by said output of said class decoder for coupling an output of the selected one of said excitation generators to an input of a synthesizer filter and , via a feedback path , also to said adaptive code book ;
an unvoiced class excitation generator and a transition class excitation generator coupled between said first and second multi-position switch elements ;
wherein for said transition class , at least one window position is decoded in a window decoder having an input coupled to said input bit stream ;
and wherein a codebook vector is retrieved from a transition excitation fixed codebook (average energy) using information concerning the at least one window location output from said window decoder and by multiplying a retrieved codebook vector ;
and wherein for said voiced class , the input bit stream encodes pitch information for the encoded speech signal frame which is decoded in a pitch decoder block having an output coupled to a window generator block that generates at least one window based on the decoded pitch information , said at least one window being used to retrieve , from an adaptive code book , an adaptive code book vector used for generating an excitation vector which is multiplied by a gain element and added to an adaptive codebook excitation to give a total excitation for a voiced frame .
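
A minimal sketch of the energy information parameter of claim 4 above: a maximum of the signal energy (here taken over pitch-length sub-windows) for frames classified as voiced or onset, and the average energy per sample otherwise. The windowing choice is an assumption, not a value from the patent.

```python
# Hedged sketch: class-dependent energy information parameter.
import numpy as np

def energy_information(frame: np.ndarray, frame_class: str, pitch_period: int) -> float:
    if frame_class in ("VOICED", "ONSET"):
        # maximum energy over pitch-length sub-windows of the frame
        hops = range(0, max(len(frame) - pitch_period, 1), max(pitch_period // 2, 1))
        return max(float(np.sum(frame[i:i + pitch_period] ** 2)) for i in hops)
    return float(np.mean(frame ** 2))        # average energy per sample

rng = np.random.default_rng(1)
x = rng.standard_normal(256)
print(energy_information(x, "VOICED", 64), energy_information(x, "UNVOICED", 64))
```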

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .
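
A minimal sketch of the energy control of claim 5 above: the synthesized signal is scaled so that it starts near the energy reached at the end of the concealed frames and converges toward the gain implied by the transmitted energy parameter, with the target gain capped to limit an energy increase. The linear gain interpolation and the cap value are assumptions for illustration.

```python
# Hedged sketch: scale and converge the energy of the first good frame after erasure.
import numpy as np

def scale_recovered_frame(synth: np.ndarray,
                          energy_end_of_concealment: float,
                          transmitted_energy: float,
                          max_gain_increase: float = 2.0) -> np.ndarray:
    eps = 1e-12
    e_start = np.mean(synth[:16] ** 2) + eps      # local energy at frame start
    e_end = np.mean(synth[-16:] ** 2) + eps       # local energy at frame end
    g0 = np.sqrt(energy_end_of_concealment / e_start)
    g1 = min(np.sqrt(transmitted_energy / e_end), g0 * max_gain_increase)
    gains = np.linspace(g0, g1, len(synth))       # sample-by-sample interpolation
    return synth * gains

rng = np.random.default_rng(2)
frame = rng.standard_normal(160)
print(scale_recovered_frame(frame, 0.25, 1.0)[:5])
```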

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (codebook excitation) and the first non erased frame received after frame erasure is encoded as active speech .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 34
. A speech decoder , comprising : a class decoder having an input coupled to an input node of said speech decoder for extracting from an input bit stream predetermined ones of bits encoding class information for an encoded speech signal frame and for decoding the class information , wherein there are a plurality of predetermined classes ;
said plurality of predetermined classes comprises a voiced class , an unvoiced class and a transition class ;
and wherein said input bit stream is also coupled to an input of a LSP decoder ;
a first multi-position switch element controlled by an output of said class decoder for directing said input bit stream to an input of one of selected one of a plurality of excitation generators , an individual one of said excitation generators corresponding to one of said plurality of predetermined classes ;
a second multi-position switch element controlled by said output of said class decoder for coupling an output of the selected one of said excitation generators to an input of a synthesizer filter and , via a feedback path , also to said adaptive code book ;
an unvoiced class excitation generator and a transition class excitation generator coupled between said first and second multi-position switch elements ;
wherein for said transition class , at least one window position is decoded in a window decoder having an input coupled to said input bit stream ;
and wherein a codebook vector is retrieved from a transition excitation fixed codebook using information concerning the at least one window location output from said window decoder and by multiplying a retrieved codebook vector ;
and wherein for said voiced class , the input bit stream encodes pitch information for the encoded speech signal frame which is decoded in a pitch decoder block having an output coupled to a window generator block that generates at least one window based on the decoded pitch information , said at least one window being used to retrieve , from an adaptive code book , an adaptive code book vector used for generating an excitation vector which is multiplied by a gain element and added to an adaptive codebook excitation (comfort noise) to give a total excitation for a voiced frame .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (encoded speech signal, pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 25
. A wireless voice communicator , comprising ;
a wireless transceiver comprising a transmitter and a receiver ;
an input speech transducer and an output speech transducer ;
and a speech processor comprising , a sampling and framing unit having an input coupled to an output of said input speech transducer for partitioning samples of an input speech signal into frames ;
a first classifier for classifying a frame as being one of an unvoiced frame or a not unvoiced frame and a second classifier for classifying said not unvoiced frame as being one of a voiced frame or a transition frame ;
a windowing unit for determining the location of at least one window in a frame ;
and an encoder for providing an encoded speech signal (decoder determines concealment, decoder concealment, pitch period) where , in an excitation for the frame , all or substantially all of non-zero excitation amplitudes lie within the at least one window ;
said wireless communicator further comprising a modulator for modulating a carrier with the encoded speech signal , said modulator having an output coupled to an input of said transmitter ;
a demodulator having an input coupled to an output of said receiver for demodulating a carrier that is encoded with a speech signal and that was transmitted from a remote transmitter ;
and said speech processor further comprising a decoder having an input coupled to an output of said demodulator for decoding an excitation from a frame wherein all or substantially all of non-zero excitation amplitudes lie within at least one window , said decoder having an output coupled to an input of said output speech transducer .

US6311154B1
CLAIM 32
. A wireless communicator as in claim 31 , wherein the predetermined distance is one pitch period (decoder determines concealment, decoder concealment, pitch period) , and wherein the jitter value is an integer between about −8 and about +7 .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · (E LP0 / E LP1) , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .
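
A minimal sketch of the E q relation recited in claims 12 and 25, read here as E q = E 1 · (E LP0 / E LP1), with E LP0 and E LP1 computed as energies of the impulse responses of the old and new LP synthesis filters; the impulse-response length and the toy filter coefficients are assumptions for illustration only.

```python
# Hedged sketch (assumed reading: E_q = E_1 * E_LP0 / E_LP1) of the energy adjustment.
import numpy as np

def lp_impulse_response_energy(a: np.ndarray, length: int = 64) -> float:
    """Energy of the impulse response of the all-pole synthesis filter 1/A(z)."""
    a = np.asarray(a, dtype=float)
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0            # unit impulse input
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc / a[0]
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1: float, a_old: np.ndarray, a_new: np.ndarray) -> float:
    """E_q = E_1 * (E_LP0 / E_LP1), E_LP0/E_LP1 being the old/new filter energies."""
    return e1 * lp_impulse_response_energy(a_old) / lp_impulse_response_energy(a_new)

print(adjusted_excitation_energy(1.0, np.array([1.0, -0.9]), np.array([1.0, -0.5])))
```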

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (encoded speech signal, pitch period) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 25
. A wireless voice communicator , comprising ;
a wireless transceiver comprising a transmitter and a receiver ;
an input speech transducer and an output speech transducer ;
and a speech processor comprising , a sampling and framing unit having an input coupled to an output of said input speech transducer for partitioning samples of an input speech signal into frames ;
a first classifier for classifying a frame as being one of an unvoiced frame or a not unvoiced frame and a second classifier for classifying said not unvoiced frame as being one of a voiced frame or a transition frame ;
a windowing unit for determining the location of at least one window in a frame ;
and an encoder for providing an encoded speech signal (decoder determines concealment, decoder concealment, pitch period) where , in an excitation for the frame , all or substantially all of non-zero excitation amplitudes lie within the at least one window ;
said wireless communicator further comprising a modulator for modulating a carrier with the encoded speech signal , said modulator having an output coupled to an input of said transmitter ;
a demodulator having an input coupled to an output of said receiver for demodulating a carrier that is encoded with a speech signal and that was transmitted from a remote transmitter ;
and said speech processor further comprising a decoder having an input coupled to an output of said demodulator for decoding an excitation from a frame wherein all or substantially all of non-zero excitation amplitudes lie within at least one window , said decoder having an output coupled to an input of said output speech transducer .

US6311154B1
CLAIM 32
. A wireless communicator as in claim 31 , wherein the predetermined distance is one pitch period (decoder determines concealment, decoder concealment, pitch period) , and wherein the jitter value is an integer between about −8 and about +7 .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (encoded speech signal, pitch period) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 25
. A wireless voice communicator , comprising ;
a wireless transceiver comprising a transmitter and a receiver ;
an input speech transducer and an output speech transducer ;
and a speech processor comprising , a sampling and framing unit having an input coupled to an output of said input speech transducer for partitioning samples of an input speech signal into frames ;
a first classifier for classifying a frame as being one of an unvoiced frame or a not unvoiced frame and a second classifier for classifying said not unvoiced frame as being one of a voiced frame or a transition frame ;
a windowing unit for determining the location of at least one window in a frame ;
and an encoder for providing an encoded speech signal (decoder determines concealment, decoder concealment, pitch period) where , in an excitation for the frame , all or substantially all of non-zero excitation amplitudes lie within the at least one window ;
said wireless communicator further comprising a modulator for modulating a carrier with the encoded speech signal , said modulator having an output coupled to an input of said transmitter ;
a demodulator having an input coupled to an output of said receiver for demodulating a carrier that is encoded with a speech signal and that was transmitted from a remote transmitter ;
and said speech processor further comprising a decoder having an input coupled to an output of said demodulator for decoding an excitation from a frame wherein all or substantially all of non-zero excitation amplitudes lie within at least one window , said decoder having an output coupled to an input of said output speech transducer .

US6311154B1
CLAIM 32
. A wireless communicator as in claim 31 , wherein the predetermined distance is one pitch period (decoder determines concealment, decoder concealment, pitch period) , and wherein the jitter value is an integer between about −8 and about +7 .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 34
. A speech decoder , comprising : a class decoder having an input coupled to an input node of said speech decoder for extracting from an input bit stream predetermined ones of bits encoding class information for an encoded speech signal frame and for decoding the class information , wherein there are a plurality of predetermined classes ;
said plurality of predetermined classes comprises a voiced class , an unvoiced class and a transition class ;
and wherein said input bit stream is also coupled to an input of a LSP decoder ;
a first multi-position switch element controlled by an output of said class decoder for directing said input bit stream to an input of one of selected one of a plurality of excitation generators , an individual one of said excitation generators corresponding to one of said plurality of predetermined classes ;
a second multi-position switch element controlled by said output of said class decoder for coupling an output of the selected one of said excitation generators to an input of a synthesizer filter and , via a feedback path , also to said adaptive code book ;
an unvoiced class excitation generator and a transition class excitation generator coupled between said first and second multi-position switch elements ;
wherein for said transition class , at least one window position is decoded in a window decoder having an input coupled to said input bit stream ;
and wherein a codebook vector is retrieved from a transition excitation fixed codebook (average energy) using information concerning the at least one window location output from said window decoder and by multiplying a retrieved codebook vector ;
and wherein for said voiced class , the input bit stream encodes pitch information for the encoded speech signal frame which is decoded in a pitch decoder block having an output coupled to a window generator block that generates at least one window based on the decoded pitch information , said at least one window being used to retrieve , from an adaptive code book , an adaptive code book vector used for generating an excitation vector which is multiplied by a gain element and added to an adaptive codebook excitation to give a total excitation for a voiced frame .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (codebook excitation) and the first non erased frame received after frame erasure is encoded as active speech .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 34
. A speech decoder , comprising : a class decoder having an input coupled to an input node of said speech decoder for extracting from an input bit stream predetermined ones of bits encoding class information for an encoded speech signal frame and for decoding the class information , wherein there are a plurality of predetermined classes ;
said plurality of predetermined classes comprises a voiced class , an unvoiced class and a transition class ;
and wherein said input bit stream is also coupled to an input of a LSP decoder ;
a first multi-position switch element controlled by an output of said class decoder for directing said input bit stream to an input of one of selected one of a plurality of excitation generators , an individual one of said excitation generators corresponding to one of said plurality of predetermined classes ;
a second multi-position switch element controlled by said output of said class decoder for coupling an output of the selected one of said excitation generators to an input of a synthesizer filter and , via a feedback path , also to said adaptive code book ;
an unvoiced class excitation generator and a transition class excitation generator coupled between said first and second multi-position switch elements ;
wherein for said transition class , at least one window position is decoded in a window decoder having an input coupled to said input bit stream ;
and wherein a codebook vector is retrieved from a transition excitation fixed codebook using information concerning the at least one window location output from said window decoder and by multiplying a retrieved codebook vector ;
and wherein for said voiced class , the input bit stream encodes pitch information for the encoded speech signal frame which is decoded in a pitch decoder block having an output coupled to a window generator block that generates at least one window based on the decoded pitch information , said at least one window being used to retrieve , from an adaptive code book , an adaptive code book vector used for generating an excitation vector which is multiplied by a gain element and added to an adaptive codebook excitation (comfort noise) to give a total excitation for a voiced frame .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (encoded speech signal, pitch period) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 25
. A wireless voice communicator , comprising ;
a wireless transceiver comprising a transmitter and a receiver ;
an input speech transducer and an output speech transducer ;
and a speech processor comprising , a sampling and framing unit having an input coupled to an output of said input speech transducer for partitioning samples of an input speech signal into frames ;
a first classifier for classifying a frame as being one of an unvoiced frame or a not unvoiced frame and a second classifier for classifying said not unvoiced frame as being one of a voiced frame or a transition frame ;
a windowing unit for determining the location of at least one window in a frame ;
and an encoder for providing an encoded speech signal (decoder determines concealment, decoder concealment, pitch period) where , in an excitation for the frame , all or substantially all of non-zero excitation amplitudes lie within the at least one window ;
said wireless communicator further comprising a modulator for modulating a carrier with the encoded speech signal , said modulator having an output coupled to an input of said transmitter ;
a demodulator having an input coupled to an output of said receiver for demodulating a carrier that is encoded with a speech signal and that was transmitted from a remote transmitter ;
and said speech processor further comprising a decoder having an input coupled to an output of said demodulator for decoding an excitation from a frame wherein all or substantially all of non-zero excitation amplitudes lie within at least one window , said decoder having an output coupled to an input of said output speech transducer .

US6311154B1
CLAIM 32
. A wireless communicator as in claim 31 , wherein the predetermined distance is one pitch period (decoder determines concealment, decoder concealment, pitch period) , and wherein the jitter value is an integer between about −8 and about +7 .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .

US6311154B1
CLAIM 34
. A speech decoder , comprising : a class decoder having an input coupled to an input node of said speech decoder for extracting from an input bit stream predetermined ones of bits encoding class information for an encoded speech signal frame and for decoding the class information , wherein there are a plurality of predetermined classes ;
said plurality of predetermined classes comprises a voiced class , an unvoiced class and a transition class ;
and wherein said input bit stream is also coupled to an input of a LSP decoder ;
a first multi-position switch element controlled by an output of said class decoder for directing said input bit stream to an input of one of selected one of a plurality of excitation generators , an individual one of said excitation generators corresponding to one of said plurality of predetermined classes ;
a second multi-position switch element controlled by said output of said class decoder for coupling an output of the selected one of said excitation generators to an input of a synthesizer filter and , via a feedback path , also to said adaptive code book ;
an unvoiced class excitation generator and a transition class excitation generator coupled between said first and second multi-position switch elements ;
wherein for said transition class , at least one window position is decoded in a window decoder having an input coupled to said input bit stream ;
and wherein a codebook vector is retrieved from a transition excitation fixed codebook (average energy) using information concerning the at least one window location output from said window decoder and by multiplying a retrieved codebook vector ;
and wherein for said voiced class , the input bit stream encodes pitch information for the encoded speech signal frame which is decoded in a pitch decoder block having an output coupled to a window generator block that generates at least one window based on the decoded pitch information , said at least one window being used to retrieve , from an adaptive code book , an adaptive code book vector used for generating an excitation vector which is multiplied by a gain element and added to an adaptive codebook excitation to give a total excitation for a voiced frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
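
For orientation only, a minimal Python sketch of the energy-adjustment relation recited above, E_q = E_1 (E_LP0 / E_LP1). The truncation length of the impulse response and the convention of rescaling the frame's excitation so its energy equals E_q are assumptions for illustration; all names are hypothetical.

```python
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the truncated impulse response of the LP synthesis filter 1/A(z),
    with a = [1, a1, ..., ap] (truncation length n is an assumption)."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
    """Compute E_q = E_1 * (E_LP0 / E_LP1) and rescale the first good frame's
    excitation so that its energy equals E_q."""
    e_lp0 = lp_impulse_response_energy(a_last_good)    # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)   # first good frame after the erasure
    e_q = e1 * e_lp0 / e_lp1
    current = float(np.sum(excitation ** 2)) + 1e-12
    return excitation * np.sqrt(e_q / current)
```
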
US6311154B1
CLAIM 12
. A method as in claim 8 , wherein one of the plurality of codebooks is comprised of an adaptive codebook (sound signal, speech signal) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH11259098A

Filed: 1998-12-24     Issued: 1999-09-24

Speech encoding/decoding method

(Original Assignee) Toshiba Corp

Ko Amada, Kimio Miseki
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
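
For orientation only, a minimal Python sketch of the phase-information step recited above: taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position. Operating on the LP residual and using a uniform quantization step are assumptions; names are hypothetical.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Locate the sample of maximum amplitude within the first pitch period and
    quantize its position with a uniform step (step size is an assumption)."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))                 # sample of maximum amplitude
    q_pos = min(int(round(pos / step)) * step, pitch_period - 1)
    return pos, q_pos
```
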
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
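
For orientation only, a minimal Python sketch of the energy information parameter recited above: a maximum of the signal energy for voiced or onset frames, an average energy per sample otherwise. Using a windowed-envelope maximum is only one plausible reading of "maximum of a signal energy"; window length and names are assumptions.

```python
import numpy as np

def energy_information(frame, frame_class, win=32):
    """Energy information parameter per the classification of the frame."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        w = max(1, min(win, x.size))
        env = np.convolve(x ** 2, np.ones(w) / w, mode="valid")
        return float(env.max())                           # maximum of the signal energy
    return float(np.mean(x ** 2))                         # average energy per sample
```
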
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
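
For orientation only, a minimal Python sketch of the energy control recited above: scale the synthesized signal so its energy at the beginning of the first good frame is similar to the end of the last concealed frame, then converge toward the received energy while limiting any increase. The quarter-frame energy estimates, the linear gain interpolation and the gain cap are assumptions.

```python
import numpy as np

def recover_energy(synth, e_end_concealed, e_target, max_gain=1.5):
    """Scale the first good frame after an erasure: match the concealed-frame
    energy at the start, converge to the target energy at the end, cap increases."""
    x = np.asarray(synth, dtype=float)
    n = x.size
    quarter = max(1, n // 4)
    e_begin = float(np.mean(x[:quarter] ** 2)) + 1e-12
    e_end = float(np.mean(x[-quarter:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match concealed-frame energy
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to received energy
    return x * np.linspace(g0, g1, n)
```
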
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
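
For orientation only, a small Python sketch of the gain-equalization rule recited above: the gain used at the beginning of the first good frame is set equal to the gain used at its end in the two listed cases. Class labels and flag names are illustrative assumptions.

```python
VOICED_LIKE = {"voiced transition", "voiced", "onset"}

def recovery_gains(last_class, first_class, g0, g1,
                   last_comfort_noise=False, first_active_speech=False):
    """Return (begin_gain, end_gain) for scaling the synthesized signal,
    equalizing them for voiced-type -> unvoiced and comfort noise -> active speech."""
    if (last_class in VOICED_LIKE and first_class == "unvoiced") or \
            (last_comfort_noise and first_active_speech):
        return g1, g1                     # beginning gain made equal to end gain
    return g0, g1
```
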
JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.

JPH11259098A
CLAIM 11
[Claim 11] A speech decoding [音声復号] (speech signal) method characterized in that a speech signal is decoded by inputting to a synthesis filter a drive signal comprising a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH11259098A
CLAIM 3
[Claim 3] A speech encoding method for encoding a speech signal by representing it with at least information describing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the nature of the speech signal, the amplitude of each pulse being optimized [最適化] (energy information parameter) by predetermined means.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6385576B2

Filed: 1998-12-23     Issued: 2002-05-07

Speech encoding/decoding method using reduced subframe pulse positions having density related to pitch

(Original Assignee) Toshiba Corp     (Current Assignee) Toshiba Corp

Tadashi Amada, Kimio Miseki
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
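
For orientation only, a minimal Python sketch of the artificial periodic excitation recited above: centre the first low-pass impulse response on the quantized first-glottal-pulse position and repeat it every average pitch period to the end of the affected span. The centring convention and parameter names are assumptions.

```python
import numpy as np

def build_onset_excitation(frame_len, first_pulse_pos, avg_pitch, lowpass_ir):
    """Low-pass filtered periodic train of pulses for a lost onset frame."""
    exc = np.zeros(frame_len)
    ir = np.asarray(lowpass_ir, dtype=float)
    half = ir.size // 2
    period = max(1, int(round(avg_pitch)))
    pos = int(first_pulse_pos)
    while pos < frame_len:
        for i, h in enumerate(ir):                 # overlap-add one centred impulse response
            idx = pos - half + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += period                              # next pulse one average pitch period later
    return exc
```
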
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .
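
For orientation only, a small Python sketch of the pulse-position candidate reduction described in US6385576B2 above, where candidate density follows the pitch-vector power. Keeping the positions with the highest local power is only one possible reading of that step; the selection rule and names are assumptions.

```python
import numpy as np

def reduced_pulse_position_candidates(pitch_vector, num_candidates):
    """Select a reduced set of pulse-position candidates, denser where the
    pitch vector has large power."""
    power = np.asarray(pitch_vector, dtype=float) ** 2
    keep = np.argsort(power)[::-1][:num_candidates]   # highest-power positions first
    return np.sort(keep)
```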

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (large power) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 4
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook containing past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being high where the pitch vector has a large power (controlling energy) and decreasing in accordance with a decrease in the power ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
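The comparison of LP-filter gains that triggers the energy adjustment in claim 8 can be approximated by comparing the energies of the filters' truncated impulse responses, as in the hedged sketch below. The use of scipy.signal.lfilter and the 64-sample truncation are assumptions.

```python
# Minimal sketch of the gain comparison in US7693710 claim 8: the "gain" of an
# LP filter is taken here as the energy of its truncated impulse response.
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of the synthesis filter 1/A(z),
    where a_coeffs = [1, a1, ..., aP]."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def needs_energy_adjustment(a_first_good, a_last_erased):
    """True when the LP gain of the first good frame exceeds that of the last
    erased frame, i.e. when the excitation energy should be rescaled."""
    return lp_impulse_response_energy(a_first_good) > lp_impulse_response_energy(a_last_erased)
```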
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .
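A hypothetical decoder-side counterpart of the encoder sketch given earlier: the decoder re-derives the same reduced candidate set from the decoded pitch vector and interprets the received indices as offsets into that set. Names and the unit-amplitude assumption are illustrative only.

```python
# Hypothetical decoder-side reconstruction in the spirit of US6385576 claim 18:
# it mirrors the encoder sketch above, so encoder and decoder agree on the
# reduced candidate positions without transmitting them.
import numpy as np

def decode_pulse_train(pitch_vector, pulse_indices, signs, subframe_len, num_candidates):
    candidates = np.sort(np.argsort(-np.abs(pitch_vector))[:num_candidates])
    train = np.zeros(subframe_len)
    for idx, sign in zip(pulse_indices, signs):
        train[candidates[idx]] = sign      # sign is +1.0 or -1.0; unit amplitude assumed
    return train

def build_excitation(pitch_vector, pitch_gain, pulse_train, pulse_gain):
    """Excitation = scaled pitch vector + scaled pulse train, fed to 1/A(z)."""
    return pitch_gain * pitch_vector + pulse_gain * pulse_train
```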

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
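Claim 10's phase-information encoding can be sketched as below: the first glottal pulse's sign and amplitude are quantised and its local shape is matched against a small codebook. The codebook entries, the three-sample window and all names are invented for illustration.

```python
# Rough sketch of the phase-information encoding of US7693710 claim 10.
# The pulse-shape codebook and the 3-sample shape window are assumptions.
import numpy as np

SHAPE_CODEBOOK = np.array([[0.2, 1.0, 0.2],    # symmetric pulse
                           [0.0, 1.0, 0.5],    # right-skewed pulse
                           [0.5, 1.0, 0.0]])   # left-skewed pulse

def encode_first_glottal_pulse(residual, pitch_period):
    seg = residual[:pitch_period]
    pos = int(np.argmax(np.abs(seg)))          # first glottal pulse position
    amp = float(np.abs(seg[pos]))
    sign = 1 if seg[pos] >= 0 else -1
    window = seg[max(pos - 1, 0):pos + 2]
    window = np.pad(window, (0, 3 - len(window)))
    shape_idx = int(np.argmin(np.sum((SHAPE_CODEBOOK - window / (amp + 1e-12)) ** 2, axis=1)))
    return {"position": pos, "sign": sign, "amplitude": amp, "shape_index": shape_idx}
```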
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
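Claim 11's position determination and quantisation can be sketched as follows; the 6-bit budget and the uniform quantisation step are assumptions made only to show the mechanics.

```python
# Minimal sketch for claim 11: the index of the maximum-amplitude sample inside
# the pitch period is taken as the first glottal pulse and its position is
# coded on a fixed (assumed) bit budget by uniform quantisation.
import numpy as np

def quantize_pulse_position(residual, pitch_period, bits=6):
    pos = int(np.argmax(np.abs(residual[:pitch_period])))   # first glottal pulse
    levels = 1 << bits
    step = max(pitch_period / levels, 1)
    q_index = int(pos / step)                                # transmitted index
    q_pos = int(round(q_index * step))                       # decoder's reconstruction
    return q_index, q_pos
```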
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
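The relation quoted in claim 12 translates directly into a couple of lines; the square-root gain form in scale_excitation is an assumption about how the energy target would be applied per sample.

```python
# Direct transcription of the quoted relation E_q = E_1 * (E_LP0 / E_LP1):
# when the new LP filter has a higher gain than the one used during the
# erasure, the excitation energy target is scaled down accordingly.
import numpy as np

def adjusted_excitation_energy(e1, e_lp0, e_lp1):
    """e1:    excitation energy at the end of the current (concealed) frame
       e_lp0: impulse-response energy of the LP filter before the erasure
       e_lp1: impulse-response energy of the LP filter of the first good frame"""
    return e1 * e_lp0 / e_lp1

def scale_excitation(excitation, e_lp0, e_lp1):
    """Apply the corresponding per-sample gain sqrt(E_LP0 / E_LP1) to the
    excitation of the first good frame (this gain form is an assumption)."""
    return excitation * np.sqrt(e_lp0 / e_lp1)
```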
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
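The artificial periodic excitation of claim 13 amounts to tiling a low-pass impulse response from the quantised glottal-pulse position onward at the average pitch spacing. The FIR design via scipy.signal.firwin, the cutoff and the tap count below are assumptions.

```python
# Sketch of the artificial periodic excitation of claim 13: copies of a short
# low-pass impulse response are centred at the quantised first-glottal-pulse
# position and then every average pitch period until the end of the affected
# region.
import numpy as np
from scipy.signal import firwin

def build_periodic_excitation(first_pulse_pos, avg_pitch, length, cutoff=0.25, taps=17):
    h = firwin(taps, cutoff)                 # low-pass filter impulse response
    half = taps // 2
    avg_pitch = int(round(avg_pitch))        # pitch spacing in samples
    excitation = np.zeros(length)
    pos = int(first_pulse_pos)
    while pos < length:
        start = max(pos - half, 0)
        stop = min(pos + half + 1, length)
        excitation[start:stop] += h[start - (pos - half):stop - (pos - half)]
        pos += avg_pitch                     # next pulse, one average pitch later
    return excitation
```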
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
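The energy-information computation in claim 16 reduces to a two-branch rule; taking the sample-wise maximum of the squared signal for voiced and onset frames is a simplification of the claim's "maximum of a signal energy".

```python
# Sketch of the energy-information parameter of claim 16: maximum signal energy
# for voiced/onset frames, average energy per sample for all other classes.
import numpy as np

def energy_information(frame, frame_class):
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))     # maximum of the signal energy
    return float(np.mean(frame ** 2))        # average energy per sample
```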
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
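Claim 17's two constraints on the synthesis gain can be sketched as a linear ramp between two gains, with the second gain capped to limit any increase in energy. The quarter-frame energy windows, the linear ramp and the cap value are assumptions.

```python
# Hedged sketch of the gain control in claim 17: start the first good frame at
# the gain that matches the concealed signal's closing energy, then ramp toward
# the gain implied by the received energy parameter, capping the increase.
import numpy as np

def rescale_first_good_frame(synth, e_end_concealed, e_received, max_gain=1.5):
    quarter = len(synth) // 4
    e_begin = np.sum(synth[:quarter] ** 2) + 1e-12
    e_end = np.sum(synth[-quarter:] ** 2) + 1e-12
    g0 = np.sqrt(e_end_concealed / e_begin)          # match the concealed signal's energy
    g1 = min(np.sqrt(e_received / e_end), max_gain)  # converge to received energy, limited
    gains = np.linspace(g0, g1, len(synth))          # sample-by-sample ramp
    return synth * gains
```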
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6385576B2
CLAIM 1
. A speech encoding method comprising : generating information representing characteristics of a synthesis filter based on an input speech signal in units of one frame ;
generating a pitch vector from an adaptive codebook (sound signal, speech signal) containing a plurality of past excitation signals ;
generating a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
and selecting a second number of pulse positions from the reduced pulse position candidates to generate a pulse train having a plurality of pulses located at a plurality of pulse positions corresponding to a second number of pulse positions under the criterion of minimizing an error between the input speech signal and a synthesis signal which is an output of the synthesis filter whose input is an excitation signal generated by adding the pitch vector and the pulse train .

US6385576B2
CLAIM 18
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : a receiver configured to receive an encoded bit stream containing indices relative to a synthesis filter in units of one frame , and a pitch vector and a pulse train in units of one sub-frame ;
a first generator configured to generate the synthesis filter and the pitch vector depending on the indices ;
a second generator configured to generate a first number of reduced pulse position candidates by selecting a first number of pulse positions from a number of possible pulse positions in the sub-frame , a density of the reduced pulse position candidates being changed in accordance with a shape of the pitch vector ;
a third generator configured to generate a second number of pulse positions from the first number of reduced pulse position candidates based on the indices ;
a fourth generator configured to generate a pulse train having plurality of pulses located at a plurality of pulse positions corresponding to the second number of pulse positions ;
a fifth generator configured to generate an excitation signal including the pitch vector and the pulse train ;
and an input device configured to input the excitation signal to a synthesis filter for reconstructing a speech signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6226606B1

Filed: 1998-11-24     Issued: 2001-05-01

Method and apparatus for pitch tracking

(Original Assignee) Microsoft Corp     (Current Assignee) Zhigu Holdings Ltd

Alejandro Acero, James G. Droppo, III
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (time window) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6226606B1
CLAIM 1
. A method for tracking pitch in a speech signal , the method comprising : sampling the speech signal across a first time window (impulse responses) that is centered at a first time mark to produce a first window vector ;
sampling the speech signal across a second time window that is centered at a second time mark to produce a second window vector , the second time mark separated from the first time mark by a test pitch period ;
calculating an energy value indicative of the energy of the portion of the speech signal represented by the first window vector ;
calculating a cross-correlation value based on the first window vector and the second window vector ;
combining the energy value and the cross-correlation value to produce a predictable energy factor ;
determining a pitch score for the test pitch period based in part on the predictable energy factor ;
and identifying at least a portion of a pitch track based in part on the pitch score .
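For reference, the pitch-score computation recited in US6226606 claim 1 can be sketched as below. The 200-sample window, the normalised cross-correlation and the multiplicative combination into a "predictable energy" score are assumptions for illustration.

```python
# Hedged sketch of US6226606 claim 1: two windows one test pitch period apart
# are correlated, and the correlation is combined with the first window's
# energy into a predictable-energy style score for that candidate pitch.
import numpy as np

def pitch_score(speech, t, test_pitch, win=200):
    # windows centred at t and t + test_pitch; caller must keep both inside the signal
    w1 = speech[t - win // 2: t + win // 2]
    w2 = speech[t + test_pitch - win // 2: t + test_pitch + win // 2]
    energy = float(np.dot(w1, w1))
    xcorr = float(np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2) + 1e-12))
    return energy * xcorr                    # combined "predictable energy" score

def best_pitch(speech, t, candidate_periods):
    """Pick the candidate period with the highest score at time mark t."""
    return max(candidate_periods, key=lambda p: pitch_score(speech, t, p))
```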

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i, steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6226606B1
CLAIM 14
. The method of claim 13 wherein identifying the pitch track comprises i (LP filter) dentifying the pitch track associated with the highest pitch track score .

US6226606B1
CLAIM 20
. A method for tracking pitch in a speech signal , the method comprising : sampling a first waveform in the speech signal ;
sampling a second waveform in the speech signal , the center of the first waveform separated from the center of the second waveform by a test pitch period ;
creating a correlation value indicative of the degree of similarity between the first waveform and the second waveform through steps c (LP filter) omprising : determining the cross-correlation between the first waveform and the second waveform ;
determining the energy of the first waveform ;
and multiplying the cross-correlation by the energy to produce the correlation value ;
creating a pitch-contouring factor indicative of the similarity between the test pitch period and a previous pitch period ;
combining the correlation value and the pitch-contouring factor to produce a pitch score for transitioning from the previous pitch period to the test pitch period ;
and identifying a portion of a pitch track based on at least one pitch score .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i, steps c) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6226606B1
CLAIM 14
. The method of claim 13 wherein identifying the pitch track comprises i (LP filter) dentifying the pitch track associated with the highest pitch track score .

US6226606B1
CLAIM 20
. A method for tracking pitch in a speech signal , the method comprising : sampling a first waveform in the speech signal ;
sampling a second waveform in the speech signal , the center of the first waveform separated from the center of the second waveform by a test pitch period ;
creating a correlation value indicative of the degree of similarity between the first waveform and the second waveform through steps c (LP filter) omprising : determining the cross-correlation between the first waveform and the second waveform ;
determining the energy of the first waveform ;
and multiplying the cross-correlation by the energy to produce the correlation value ;
creating a pitch-contouring factor indicative of the similarity between the test pitch period and a previous pitch period ;
combining the correlation value and the pitch-contouring factor to produce a pitch score for transitioning from the previous pitch period to the test pitch period ;
and identifying a portion of a pitch track based on at least one pitch score .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i, steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6226606B1
CLAIM 14
. The method of claim 13 wherein identifying the pitch track comprises identifying (LP filter) the pitch track associated with the highest pitch track score .

US6226606B1
CLAIM 20
. A method for tracking pitch in a speech signal , the method comprising : sampling a first waveform in the speech signal ;
sampling a second waveform in the speech signal , the center of the first waveform separated from the center of the second waveform by a test pitch period ;
creating a correlation value indicative of the degree of similarity between the first waveform and the second waveform through steps comprising (LP filter) : determining the cross-correlation between the first waveform and the second waveform ;
determining the energy of the first waveform ;
and multiplying the cross-correlation by the energy to produce the correlation value ;
creating a pitch-contouring factor indicative of the similarity between the test pitch period and a previous pitch period ;
combining the correlation value and the pitch-contouring factor to produce a pitch score for transitioning from the previous pitch period to the test pitch period ;
and identifying a portion of a pitch track based on at least one pitch score .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (time window) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
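
As an illustration of the excitation construction recited above, the sketch below places copies of a low-pass filter impulse response starting at the quantized first-glottal-pulse position and then every average pitch period. The filter prototype lowpass_h, the use of the full frame length rather than the last affected subframe, and the rounding of the pitch period are assumptions of this sketch.

import numpy as np

def build_onset_excitation(frame_len, q_pulse_pos, avg_pitch, lowpass_h):
    # Artificial periodic excitation for a lost onset frame: one low-pass impulse response
    # is centered on the quantized first-glottal-pulse position, and further copies are
    # placed every average pitch period up to the end of the frame.
    exc = np.zeros(frame_len)
    half = len(lowpass_h) // 2
    step = max(1, int(round(avg_pitch)))   # assume a positive average pitch period
    pos = int(q_pulse_pos)
    while pos < frame_len:
        for i, h in enumerate(lowpass_h):
            idx = pos - half + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += step
    return exc
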
US6226606B1
CLAIM 1
. A method for tracking pitch in a speech signal , the method comprising : sampling the speech signal across a first time window (impulse responses) that is centered at a first time mark to produce a first window vector ;
sampling the speech signal across a second time window that is centered at a second time mark to produce a second window vector , the second time mark separated from the first time mark by a test pitch period ;
calculating an energy value indicative of the energy of the portion of the speech signal represented by the first window vector ;
calculating a cross-correlation value based on the first window vector and the second window vector ;
combining the energy value and the cross-correlation value to produce a predictable energy factor ;
determining a pitch score for the test pitch period based in part on the predictable energy factor ;
and identifying at least a portion of a pitch track based in part on the pitch score .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i, steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6226606B1
CLAIM 14
. The method of claim 13 wherein identifying the pitch track comprises identifying (LP filter) the pitch track associated with the highest pitch track score .

US6226606B1
CLAIM 20
. A method for tracking pitch in a speech signal , the method comprising : sampling a first waveform in the speech signal ;
sampling a second waveform in the speech signal , the center of the first waveform separated from the center of the second waveform by a test pitch period ;
creating a correlation value indicative of the degree of similarity between the first waveform and the second waveform through steps comprising (LP filter) : determining the cross-correlation between the first waveform and the second waveform ;
determining the energy of the first waveform ;
and multiplying the cross-correlation by the energy to produce the correlation value ;
creating a pitch-contouring factor indicative of the similarity between the test pitch period and a previous pitch period ;
combining the correlation value and the pitch-contouring factor to produce a pitch score for transitioning from the previous pitch period to the test pitch period ;
and identifying a portion of a pitch track based on at least one pitch score .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i, steps c) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6226606B1
CLAIM 14
. The method of claim 13 wherein identifying the pitch track comprises identifying (LP filter) the pitch track associated with the highest pitch track score .

US6226606B1
CLAIM 20
. A method for tracking pitch in a speech signal , the method comprising : sampling a first waveform in the speech signal ;
sampling a second waveform in the speech signal , the center of the first waveform separated from the center of the second waveform by a test pitch period ;
creating a correlation value indicative of the degree of similarity between the first waveform and the second waveform through steps comprising (LP filter) : determining the cross-correlation between the first waveform and the second waveform ;
determining the energy of the first waveform ;
and multiplying the cross-correlation by the energy to produce the correlation value ;
creating a pitch-contouring factor indicative of the similarity between the test pitch period and a previous pitch period ;
combining the correlation value and the pitch-contouring factor to produce a pitch score for transitioning from the previous pitch period to the test pitch period ;
and identifying a portion of a pitch track based on at least one pitch score .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i, steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6226606B1
CLAIM 14
. The method of claim 13 wherein identifying the pitch track comprises identifying (LP filter) the pitch track associated with the highest pitch track score .

US6226606B1
CLAIM 20
. A method for tracking pitch in a speech signal , the method comprising : sampling a first waveform in the speech signal ;
sampling a second waveform in the speech signal , the center of the first waveform separated from the center of the second waveform by a test pitch period ;
creating a correlation value indicative of the degree of similarity between the first waveform and the second waveform through steps comprising (LP filter) : determining the cross-correlation between the first waveform and the second waveform ;
determining the energy of the first waveform ;
and multiplying the cross-correlation by the energy to produce the correlation value ;
creating a pitch-contouring factor indicative of the similarity between the test pitch period and a previous pitch period ;
combining the correlation value and the pitch-contouring factor to produce a pitch score for transitioning from the previous pitch period to the test pitch period ;
and identifying a portion of a pitch track based on at least one pitch score .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6310915B1

Filed: 1998-11-20     Issued: 2001-10-30

Video transcoder with bitstream look ahead for rate control and statistical multiplexing

(Original Assignee) Harmonic Inc     (Current Assignee) LSI Corp ; Harmonic Inc ; Divicom Inc

Aaron Wells, Elliot Linzer
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (said model) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6310915B1
CLAIM 3
. The method of claim 1 wherein said step of re-encoding further comprising the steps of : altering a model of a decoder buffer to fill at a channel rate indicated by said encoding parameter , and determining a budget of a number of bits to generate while re-encoding said decoded picture which avoids overflowing and underflowing said model (first impulse) of said decoder buffer .
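
For context on the buffer-model step recited in US6310915B1 claim 3, the sketch below computes a per-picture bit budget under a simplified leaky-bucket reading of the decoder buffer model (bits arrive at the channel rate and are removed when a picture is decoded). The function and parameter names are hypothetical and this is not the reference's rate-control algorithm.

def picture_bit_budget(fullness, buffer_size, channel_rate, frame_rate):
    # The modeled decoder buffer fills at the channel rate and is drained by the bits of
    # each decoded picture; the budget range keeps it from underflowing or overflowing.
    arriving = channel_rate / frame_rate                  # bits entering the model per picture period
    upper = fullness + arriving                           # spending more than this underflows the buffer
    lower = max(0.0, fullness + arriving - buffer_size)   # spending less than this overflows it
    return lower, upper
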

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
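
The shape/sign/amplitude encoding recited in claim 2 could be realized in many ways; the sketch below is a hypothetical quantizer (1-bit sign, an assumed 6-bit log-domain amplitude index, and a shape index into an assumed small codebook of length-2·half prototypes) and does not reflect the patent's actual bit allocation or codebook.

import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_codebook, half=8):
    # Hypothetical quantization of the first glottal pulse around its detected position.
    seg = np.asarray(residual[pulse_pos - half:pulse_pos + half], dtype=float)
    amp = float(np.max(np.abs(seg))) if seg.size else 1e-6
    sign_bit = 1 if residual[pulse_pos] >= 0 else 0
    # Assumed 6-bit logarithmic amplitude quantizer.
    amp_index = int(np.clip(np.round(4.0 * np.log2(max(amp, 1e-6))) + 32, 0, 63))
    # Shape: index of the closest entry in the (assumed) normalized pulse-shape codebook.
    normalized = seg / amp
    shape_index = int(np.argmin([np.sum((normalized - c) ** 2) for c in shape_codebook]))
    return shape_index, sign_bit, amp_index
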
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
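
A minimal sketch of the position determination recited in claim 3: the sample of maximum amplitude within the first pitch period is taken as the first glottal pulse and its position is quantized. The uniform quantization step used here is an assumption, not the patent's quantizer.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=2):
    # Sample of maximum amplitude within the first pitch period, then its quantized position.
    pos = int(np.argmax(np.abs(residual[:int(pitch_period)])))
    quantized_pos = step * int(round(pos / step))
    return pos, quantized_pos
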
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
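
The energy-information computation recited in claim 4 can be sketched as follows. The per-sample maximum used for voiced/onset frames is a simplification of this sketch (the patent may evaluate the maximum pitch-synchronously), and the classification labels are plain strings for illustration only.

import numpy as np

def energy_information_parameter(frame, classification):
    # Maximum of the signal energy for voiced or onset frames,
    # average energy per sample for all other frame classes.
    frame = np.asarray(frame, dtype=float)
    if classification in ("voiced", "onset"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))
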
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
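
The energy-control behaviour recited in claim 5 (match the concealed-frame energy at the start of the first good frame, converge to the transmitted energy toward its end, limit any increase) can be sketched as a sample-wise gain interpolation. The linear interpolation, the edge window length and the gain cap max_gain are assumptions of this sketch, not the patentee's implementation.

import numpy as np

def control_synthesis_energy(synth, e_end_concealed, e_target, max_gain=2.0, edge=32):
    # Start from a gain that matches the energy at the end of the last concealed frame,
    # converge to the gain implied by the received energy information by the end of the
    # frame, and cap both gains to limit any increase in energy.
    synth = np.asarray(synth, dtype=float)
    e_start = float(np.mean(synth[:edge] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-edge:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)
    g1 = min(np.sqrt(e_target / e_end), max_gain)
    gains = np.linspace(g0, g1, synth.size)
    return synth * gains
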
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (error concealment) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 13
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (said model) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6310915B1
CLAIM 3
. The method of claim 1 wherein said step of re-encoding further comprising the steps of : altering a model of a decoder buffer to fill at a channel rate indicated by said encoding parameter , and determining a budget of a number of bits to generate while re-encoding said decoded picture which avoids overflowing and underflowing said model (first impulse) of said decoder buffer .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 14
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 15
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 16
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k>
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 17
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
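
As a rough reading of the energy-control limitation of claim 17, the sketch below scales the synthesized signal of the first non-erased frame so that its beginning matches the energy at the end of the last concealed frame and its end converges toward the energy implied by the received energy information parameter, with the gain increase capped. The window length, gain cap and linear gain interpolation are assumptions of the sketch, not features recited by the claim.

```python
import numpy as np

def scale_first_good_frame(synth: np.ndarray,
                           energy_end_concealed: float,
                           energy_target: float,
                           win: int = 32,
                           gain_cap: float = 2.0) -> np.ndarray:
    """Hedged sketch: match the frame beginning to the energy at the end of
    the last concealed frame, converge toward the received energy parameter
    by the frame end, and limit any increase in energy (gain_cap assumed)."""
    eps = 1e-12
    e_begin = float(np.mean(synth[:win] ** 2)) + eps
    e_end = float(np.mean(synth[-win:] ** 2)) + eps
    g0 = min(np.sqrt(energy_end_concealed / e_begin), gain_cap)
    g1 = min(np.sqrt(energy_target / e_end), gain_cap)
    gains = np.linspace(g0, g1, num=len(synth))   # sample-wise gain interpolation
    return synth * gains
```
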
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k >
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (error concealment) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
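
A minimal sketch of the additional onset limitation of claim 18, assuming a fixed cap value that is not specified in the claim:

```python
def limit_scaling_gain(gain: float, first_good_class: str,
                       onset_cap: float = 1.2) -> float:
    """Hedged sketch: when the first non-erased frame after an erasure is
    classified as onset, limit the scaling gain to a given value
    (onset_cap is an assumed constant, not from the patent)."""
    return min(gain, onset_cap) if first_good_class == "onset" else gain
```
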
US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 20
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k >
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 22
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k >
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 23
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
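
For the glottal-pulse elements of claims 22 and 23, a minimal sketch follows: the first glottal pulse is taken as the sample of maximum amplitude within the first pitch period, its sign and amplitude are recorded, and its position is quantized. The bit allocation and the uniform quantizer are assumptions of the sketch.

```python
import numpy as np

def first_glottal_pulse(excitation: np.ndarray, pitch_period: int, n_bits: int = 6):
    """Hedged sketch of the 'searcher' and 'quantizer' elements: locate the
    sample of maximum amplitude within a pitch period, keep its sign and
    amplitude, and quantize its position (uniform quantizer assumed)."""
    segment = excitation[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))             # position of maximum amplitude
    sign = 1 if segment[pos] >= 0.0 else -1
    amplitude = float(abs(segment[pos]))
    step = max(1, -(-pitch_period // (1 << n_bits)))  # ceil(pitch_period / 2**n_bits)
    index = pos // step                               # value that would be transmitted
    reconstructed_pos = index * step                  # decoder-side reconstruction
    return index, reconstructed_pos, sign, amplitude
```
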
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k >
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 24
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k >
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (error concealment) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (error concealment) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
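
Read literally, the recited relation can be exercised with the short sketch below, which derives E_LP0 and E_LP1 as energies of truncated impulse responses of the two LP synthesis filters; the truncation length and the use of scipy.signal.lfilter are assumptions of the sketch, not of the claim.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a: np.ndarray, n: int = 64) -> float:
    """Energy of a truncated impulse response of the LP synthesis filter 1/A(z),
    with a = [1, a1, ..., ap]; truncation length n is an assumption."""
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1: float,
                               a_last_good: np.ndarray,
                               a_first_good: np.ndarray) -> float:
    """Hedged sketch of E_q = E_1 * (E_LP0 / E_LP1): E_1 is the energy at the end
    of the current frame, E_LP0 and E_LP1 are impulse-response energies of the LP
    filters of the last good frame before and the first good frame after erasure."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * (e_lp0 / e_lp1)

# Example with two mildly different first-order LP filters
print(adjusted_excitation_energy(1.0, np.array([1.0, -0.8]), np.array([1.0, -0.6])))
```
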
US6310915B1
CLAIM 1
. A method for transcoding a previously encoded video signal to a second encoded representation (energy information parameter, phase information parameter) thereof comprising the steps of : (a) receiving k >
1 previously encoded pictures of a previously encoded video signal in a buffer , (b) scanning each of said k previously encoded pictures in said buffer to gather information on each of said k previously encoded pictures , (c) allocating an encoding parameter to one of said k previously encoded pictures , which precedes each other one of said k previously encoded pictures in encoded order of said previously encoded video signal , based on said information gathered for each of said k previously encoded pictures , (d) decoding said one previously encoded picture , to produce a decoded picture , (e) re-encoding said decoded picture , to generate a re-encoded picture in a second encoded representation of said video signal , said re-encoding being performed in a fashion which depends on said encoding parameter allocated thereto , (f) splicing an end of said previously encoded video signal together with a beginning of another encoded video signal for sequential transfer to said channel , wherein said step (a) further comprises the step of storing a picture at an end of said previously encoded video signal and at least one picture at a beginning of said another video signal in said look ahead buffer , wherein said step (c) further comprises the step of gathering information regarding said at least one encoded pictures at a beginning of said another encoded video signal and said picture at an end of said previously encoded video signal , and wherein in step (e) , said picture at an end of said previously encoded video signal is re-encoded in accordance with an encoding parameter allocated based on information gathered for said picture and for said pictures at a beginning of said another encoded video signal .

US6310915B1
CLAIM 11
. The method of claim 10 wherein said step of including comprises the step of retaining previously generated error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) motion vectors .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6373842B1

Filed: 1998-11-19     Issued: 2002-04-16

Unidirectional streaming services in wireless systems

(Original Assignee) Nortel Networks Ltd     (Current Assignee) Microsoft Technology Licensing LLC

Paul Coverdale, Leo Strawczynski
US7693710B2
CLAIM 1
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (data frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
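
For the artificially constructed periodic excitation part of claim 1, a minimal sketch is given below: it centers the first impulse response of a low-pass filter on the quantized first-glottal-pulse position and places the remaining impulse responses one average pitch period apart up to the end of the frame. The frame length, pitch value and windowed low-pass kernel are assumptions of the sketch.

```python
import numpy as np

def build_periodic_excitation(frame_len: int,
                              first_pulse_pos: int,
                              avg_pitch: int,
                              lowpass_ir: np.ndarray) -> np.ndarray:
    """Hedged sketch: low-pass filtered periodic train of pulses, the first
    impulse response centered on the quantized glottal-pulse position and the
    remaining ones spaced by the average pitch value."""
    excitation = np.zeros(frame_len)
    half = len(lowpass_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start = pos - half
        s0, e0 = max(start, 0), min(start + len(lowpass_ir), frame_len)
        excitation[s0:e0] += lowpass_ir[s0 - start:e0 - start]
        pos += avg_pitch
    return excitation

# Example: 256-sample frame, first pulse at sample 17, pitch of 64 samples,
# 9-tap Hanning window used as a stand-in low-pass impulse response
kernel = np.hanning(9) / np.hanning(9).sum()
exc = build_periodic_excitation(256, first_pulse_pos=17, avg_pitch=64, lowpass_ir=kernel)
```
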
US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 24
. The method as claimed in claim 13 wherein control signals (maximum amplitude) are transmitted and received on an associated control channel and further comprising the step of adjusting the output of frames in said buffer responsive to said control signals .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (said transmission, time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (data frames) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (said transmission, time t) erased frame received after a frame erasure (data frames) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (said transmission, time t) erased frame received after frame erasure (data frames) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (said transmission, time t) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
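
The two transition cases of claim 7 can be expressed as a simple predicate; the class labels and flags used below are assumptions of the sketch.

```python
def flat_scaling_gain(last_good_class: str,
                      first_good_class: str,
                      last_good_is_comfort_noise: bool,
                      first_good_is_active_speech: bool) -> bool:
    """Hedged sketch: return True when the gain at the beginning of the first
    non-erased frame should equal the gain at its end, i.e. on a voiced-to-
    unvoiced transition or on a comfort-noise-to-active-speech transition."""
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    inactive_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or inactive_to_active
```
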
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non (said transmission, time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data frames) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 22
. The method as claimed in claim 21 wherein said controlling step comprises (LP filter) increasing the transmission rate if the overall traffic load allows extra bandwidth and decreasing the transmission rate if the overall traffic load is short of bandwidth .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non (said transmission, time t) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (data frames) , E_LP0 is an energy of an impulse response of the LP filter of a last non (said transmission, time t) erased frame received before the frame erasure (data frames) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 22
. The method as claimed in claim 21 wherein said controlling step comprises (LP filter) increasing the transmission rate if the overall traffic load allows extra bandwidth and decreasing the transmission rate if the overall traffic load is short of bandwidth .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 24
. The method as claimed in claim 13 wherein control signals (maximum amplitude) are transmitted and received on an associated control channel and further comprising the step of adjusting the output of frames in said buffer responsive to said control signals .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (data frames) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (data frames) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non (said transmission, time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data frames) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (data frames) , E_LP0 is an energy of an impulse response of the LP filter of a last non (said transmission, time t) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 22
. The method as claimed in claim 21 wherein said controlling step comprises (LP filter) increasing the transmission rate if the overall traffic load allows extra bandwidth and decreasing the transmission rate if the overall traffic load is short of bandwidth .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (data frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
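
Analyst note — for orientation, the phase-information step recited above (taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position) can be sketched as follows. The helper names and the uniform quantizer are assumptions for illustration; the codec's actual quantizer and search window may differ.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period):
    # Sample of maximum amplitude within the first pitch period, taken as the
    # first glottal pulse per the claim element above.
    segment = residual[:int(pitch_period)]
    return int(np.argmax(np.abs(segment)))

def quantize_position(pos, pitch_period, n_bits=6):
    # Uniform quantization of the position inside the pitch period (illustrative;
    # the codec's actual quantizer is not reproduced here).
    levels = 2 ** n_bits
    step = pitch_period / levels
    index = min(levels - 1, int(pos / step))
    return index, index * step  # quantization index and reconstructed position

rng = np.random.default_rng(0)
residual = rng.standard_normal(160)          # stand-in LP residual for one frame
pos = first_glottal_pulse_position(residual, pitch_period=57)
index, pos_q = quantize_position(pos, pitch_period=57)
print(pos, index, pos_q)
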
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 24
. The method as claimed in claim 13 wherein control signals (maximum amplitude) are transmitted and received on an associated control channel and further comprising the step of adjusting the output of frames in said buffer responsive to said control signals .


US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
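
Analyst note — a minimal sketch of the energy-information computation recited above, assuming the five classification labels used in the claim and a simple per-sample energy measure; the exact windowing and any dB conversion used by the codec are not reproduced here.

import numpy as np

def energy_information(frame, classification):
    # Maximum per-sample energy for voiced or onset frames, average energy per
    # sample for the other classes (unvoiced, unvoiced transition, voiced transition).
    samples = np.asarray(frame, dtype=float)
    if classification in ("voiced", "onset"):
        return float(np.max(samples ** 2))
    return float(np.mean(samples ** 2))

frame = np.sin(2 * np.pi * 100 * np.arange(256) / 8000)   # toy 100 Hz frame at 8 kHz
print(energy_information(frame, "voiced"))     # peak energy
print(energy_information(frame, "unvoiced"))   # average energy per sample
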
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (said transmission, time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (data frames) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
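
Analyst note — the energy-control behaviour recited above can be pictured as a sample-by-sample gain that starts from a value matching the energy at the end of the concealed segment and converges toward the gain implied by the received energy parameter, with any increase capped. The linear interpolation and the cap value in the sketch below are assumptions for illustration, not the patent's stated procedure.

import numpy as np

def scale_first_good_frame(synth, e_last_concealed, e_received, max_gain=1.98):
    # g0 matches the energy at the end of the concealed segment; g1 matches the
    # received energy information parameter; the gain is interpolated sample by
    # sample and any increase is capped at max_gain (assumed value).
    synth = np.asarray(synth, dtype=float)
    e_begin = np.mean(synth[:32] ** 2) + 1e-12
    e_end = np.mean(synth[-32:] ** 2) + 1e-12
    g0 = min(np.sqrt(e_last_concealed / e_begin), max_gain)
    g1 = min(np.sqrt(e_received / e_end), max_gain)
    gains = np.linspace(g0, g1, len(synth))
    return synth * gains

frame = np.full(256, 0.1)                      # toy synthesized frame
scaled = scale_first_good_frame(frame, e_last_concealed=0.04, e_received=0.02)
print(scaled[0], scaled[-1])
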
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (said transmission, time t) erased frame received following frame erasure (data frames) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (said transmission, time t) erased frame received after frame erasure (data frames) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (said transmission, time t) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
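
Analyst note — the two transition cases recited in this claim can be expressed as a simple condition under which the decoder reuses the end-of-frame gain for the whole frame. The labels and flags below are hypothetical names used only for illustration.

def force_equal_gains(last_class, first_class, last_was_comfort_noise, first_is_active):
    # Case 1: voiced-to-unvoiced transition across the erasure.
    voiced_to_unvoiced = (last_class in ("voiced transition", "voiced", "onset")
                          and first_class == "unvoiced")
    # Case 2: comfort noise before the erasure, active speech after it.
    noise_to_speech = last_was_comfort_noise and first_is_active
    return voiced_to_unvoiced or noise_to_speech

# When True, the decoder uses the end-of-frame gain for the whole frame
# (beginning gain equal to end gain) instead of interpolating two different gains.
print(force_equal_gains("voiced", "unvoiced", False, False))   # True (case 1)
print(force_equal_gains("unvoiced", "voiced", True, True))     # True (case 2)
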
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non (said transmission, time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data frames) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 22
. The method as claimed in claim 21 wherein said controlling step comprises (LP filter) increasing the transmission rate if the overall traffic load allows extra bandwidth and decreasing the transmission rate if the overall traffic load is short of bandwidth .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non (said transmission, time t) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (data frames) , E_LP0 is an energy of an impulse response of a LP filter of a last non (said transmission, time t) erased frame received before the frame erasure (data frames) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
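
Analyst note — the relation quoted above can be evaluated directly from the energies of the two LP synthesis filters' impulse responses. The sketch below is illustrative only; the filter coefficients, impulse-response length and energy value are arbitrary assumptions.

import numpy as np

def lp_impulse_response_energy(a, length=64):
    # Energy of the impulse response of the all-pole LP synthesis filter 1/A(z),
    # with A(z) coefficients a = [1, a1, ..., aM].
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last, a_first):
    # E_q = E_1 * (E_LP0 / E_LP1), with E_LP0 and E_LP1 the impulse-response
    # energies of the LP filters before and after the erasure.
    return e1 * lp_impulse_response_energy(a_last) / lp_impulse_response_energy(a_first)

a_last = [1.0, -0.9]     # illustrative LP filter of the last non-erased frame
a_first = [1.0, -0.7]    # illustrative LP filter of the first non-erased frame
print(adjusted_excitation_energy(e1=0.05, a_last=a_last, a_first=a_first))
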
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 22
. The method as claimed in claim 21 wherein said controlling step comprises (LP filter) increasing the transmission rate if the overall traffic load allows extra bandwidth and decreasing the transmission rate if the overall traffic load is short of bandwidth .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 24
. The method as claimed in claim 13 wherein control signals (maximum amplitude) are transmitted and received on an associated control channel and further comprising the step of adjusting the output of frames in said buffer responsive to said control signals .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (data frames) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (communication link) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6373842B1
CLAIM 7
. A method of delivering unidirectional streaming services (USS) data comprising the steps of : a) determining whether said USS data requires a symmetrical , bidirectional communication link (communication link) with a recipient user ;
and b) responsive to said determining step determining said USS data does not require such symmetrical , bidirectional communication link with a recipient user , introducing a buffer delay between receiving a number of frames and delivering said number of frames by : i) receiving in a buffer USS data transmitted with a variable transmission rate ;
and ii) delivering said USS data with a constant delivery rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (data frames) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (data frames) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (data frames) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non (said transmission, time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data frames) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (data frames) , E_LP0 is an energy of an impulse response of a LP filter of a last non (said transmission, time t) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6373842B1
CLAIM 1
. A method of delivering unidirectional streaming services (USS) data transmitted via a wireless network , said method comprising the steps of : a) storing received USS frames in a receive buffer ;
b) testing received USS frames for errors ;
c) replacing , USS frames received with detected errors prior to delivery ;
and d) delivering said USS frames from said buffer at a constant rate responsive to a delay criteria being satisfied ;
wherein said replacing step comprises : i) requesting retransmission of said USS frames received with detected errors ;
ii) replacing the USS frames received with detected errors with retransmitted frames provided the retransmitted frames are received without errors in time (first non, last non) to be delivered at said constant rate ;
and iii) if said retransmitted frames are not received without errors in time to be delivered at said constant rate , reconstructing the USS frames received with detected errors and discarding any subsequently received retransmission of said frames .

US6373842B1
CLAIM 3
. The method as claimed in claim 2 wherein frames are received at a transmission rate , wherein the average of said transmission (first non, last non) rate is faster than the constant rate .

US6373842B1
CLAIM 12
. A method of transmitting USS data to a receiver adapted to carry out the method as claimed in claim 2 comprising the steps of : storing USS data frames (last frame, current frame, frame erasure, onset frame, replacement frame, frame concealment) to be transmitted in a transmit buffer ;
transmitting frames stored in said buffer at a transmission rate ;
monitoring said wireless network for retransmission requests from said receiver ;
and responsive to receiving a retransmission request , retransmitting the requested USS data .

US6373842B1
CLAIM 22
. The method as claimed in claim 21 wherein said controlling step comprises (LP filter) increasing the transmission rate if the overall traffic load allows extra bandwidth and decreasing the transmission rate if the overall traffic load is short of bandwidth .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP0911807A2

Filed: 1998-10-22     Issued: 1999-04-28

Sound synthesizing method and apparatus, and sound band expanding method and apparatus

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Masayuki Nishiguchi, c/o Sony Corporation; Shiro Omori, c/o Sony Corporation
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (input sound) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (sound parameters) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (first one) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (sound parameters) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (first one) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (sound parameters) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (first one) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non (first one) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
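For illustration only, the following Python sketch evaluates the energy-adjustment relation E_q = E_1 · ( E_LP0 / E_LP1 ) recited in claim 9 above. It assumes the impulse-response energies are obtained by exciting each LP synthesis filter 1/A(z) with a unit impulse and summing the squared output over a fixed truncation length; the truncation length, coefficient sign convention and function names are assumptions made for the example, not details taken from the patent.

    import numpy as np

    def lp_impulse_response_energy(a, length=64):
        # Energy of the impulse response of the LP synthesis filter 1/A(z),
        # with A(z) = 1 + a[0]*z^-1 + ... (sign convention and truncation length assumed).
        h = np.zeros(length)
        h[0] = 1.0
        for n in range(1, length):
            h[n] = -sum(a[k] * h[n - 1 - k] for k in range(min(len(a), n)))
        return float(np.sum(h ** 2))

    def adjusted_excitation_energy(e1, a_last_good, a_first_good):
        # E_q = E_1 * E_LP0 / E_LP1 ; E_LP0 uses the LP filter of the last good frame
        # received before the erasure, E_LP1 that of the first good frame after it.
        e_lp0 = lp_impulse_response_energy(a_last_good)
        e_lp1 = lp_impulse_response_energy(a_first_good)
        return e1 * e_lp0 / e_lp1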
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
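As a rough illustration of the glottal-pulse search recited in claim 11 above, the sketch below takes the sample of maximum absolute amplitude within one pitch period as the first glottal pulse and quantizes its position with a uniform step; operating on the LP residual and the step size of 4 samples are assumptions for the example, not details from the patent.

    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period, step=4):
        # residual: excitation/LP-residual samples of the frame (assumed input).
        # The sample of maximum absolute amplitude within the first pitch period
        # is taken as the first glottal pulse; its position is then quantized
        # with a uniform step (step size is illustrative).
        segment = np.abs(np.asarray(residual[:pitch_period], dtype=float))
        pos = int(np.argmax(segment))
        q_pos = (pos // step) * step
        return pos, q_pos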
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (input sound) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
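The artificial periodic excitation described in claim 13 above can be pictured with the short sketch below: the impulse response of a low-pass FIR filter is centred on the quantized first-glottal-pulse position, and further copies are placed every average pitch period up to the end of the concerned segment. The filter coefficients, segment length and centring convention are assumptions made for the example, not details from the patent.

    import numpy as np

    def build_periodic_excitation(seg_len, q_pulse_pos, avg_pitch, lp_fir):
        # lp_fir: impulse response of a low-pass FIR filter (coefficients assumed).
        assert avg_pitch > 0
        exc = np.zeros(seg_len)
        half = len(lp_fir) // 2
        pos = q_pulse_pos
        while pos < seg_len:
            for i, c in enumerate(lp_fir):
                idx = pos - half + i      # centre the impulse response on the pulse position
                if 0 <= idx < seg_len:
                    exc[idx] += c
            pos += avg_pitch              # next pulse one average pitch period later
        return exc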
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (sound parameters) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
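A minimal sketch of the class-dependent energy computation recited in claim 16 above: for frames classified as voiced or onset the parameter is related to a maximum of the signal energy (read here as the maximum squared sample, an assumption), while other classes use the average energy per sample.

    import numpy as np

    def energy_information(frame, frame_class):
        # frame: speech samples of one frame (assumed input).
        x = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset"):
            return float(np.max(x ** 2))   # maximum of the signal energy (per-sample reading)
        return float(np.mean(x ** 2))      # average energy per sample for other classes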
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (first one) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
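The two-step energy control of claim 17 above (matching the start of the first good frame to the end of the concealed frame, then converging toward the received energy information while limiting any increase in energy) could look roughly like the following sketch; the measurement window, the gain cap and the linear gain interpolation are assumptions made for the example.

    import numpy as np

    def scale_recovered_frame(synth, e_prev_end, e_received, win=32, max_gain=1.5):
        # synth: synthesized signal of the first good frame after an erasure (assumed input).
        # e_prev_end: energy at the end of the last concealed frame; e_received: energy
        # corresponding to the received energy information parameter.
        x = np.asarray(synth, dtype=float)
        e_begin = float(np.mean(x[:win] ** 2)) + 1e-12
        e_end = float(np.mean(x[-win:] ** 2)) + 1e-12
        g0 = np.sqrt(e_prev_end / e_begin)                 # match energy at the frame start
        g1 = min(np.sqrt(e_received / e_end), max_gain)    # converge, limiting the increase
        gains = np.linspace(g0, g1, num=len(x))            # sample-by-sample interpolation
        return x * gains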
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (sound parameters) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (first one) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (sound parameters) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (first one) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
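The two transition cases in claim 19 above reduce to a simple predicate; the sketch below only illustrates the decision of when the begin-of-frame scaling gain is set equal to the end-of-frame gain, with class labels and flags chosen for the example.

    def use_end_gain_at_frame_start(last_good_class, first_good_class,
                                    last_good_is_comfort_noise, first_good_is_active_speech):
        # Case 1: voiced-to-unvoiced transition around the erasure.
        voiced_like = {"voiced transition", "voiced", "onset"}
        voiced_to_unvoiced = (last_good_class in voiced_like
                              and first_good_class == "unvoiced")
        # Case 2: inactive-to-active transition (comfort noise before, active speech after).
        inactive_to_active = last_good_is_comfort_noise and first_good_is_active_speech
        return voiced_to_unvoiced or inactive_to_active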
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response) of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non (first one) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (sound parameters) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 33
A sound band expanding method in which , to expand the band of an input narrow-band sound , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound parameters (speech signal) , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : discriminating between a voiced sound and unvoiced sound in the input narrow-band sound at every predetermined time unit ;
generating a voiced parameter and unvoiced parameter from the narrow-band voiced and unvoiced sounds ;
quantizing the narrow-band voiced and unvoiced sound parameters of the narrow-band sound by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and expanding the band of the narrow-band sound based on the dequantized data .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response) of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0911807A2
CLAIM 1
A sound synthesizing method in which , to synthesize a sound from plural kinds of input coded parameters , there are used a wide-band voiced sound code book and a wide-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters , respectively , extracted from wide-band voiced and unvoiced sounds separated at every predetermined time unit , and a narrow-band voiced sound code book and a narrow-band unvoiced sound code book pre-formed from voiced and unvoiced sound characteristic parameters extracted from a narrow-band sound obtained by limiting the frequency band of the separated wide-band voiced and unvoiced sounds , comprising the steps of : decoding the plural kinds of coded parameters ;
forming an innovation from a first one (first non) of the plural kinds of decoded parameters ;
converting a second decoded parameter to a sound synthesis characteristic parameter ;
discriminating between the voiced and unvoiced sounds discriminable with reference to a third decoded parameter ;
quantizing the sound synthesis characteristic parameter based on the result of the discrimination by using the narrow-band voiced and unvoiced sound code books ;
dequantizing , by using the wide-band voiced and unvoiced sound code books , the narrow-band voiced and unvoiced sound data having been quantized using the narrow-band voiced and unvoiced sound code books ;
and synthesizing a sound based on the dequantized data and innovation .

EP0911807A2
CLAIM 4
The method as set forth in Claim 3 , further comprising the step of : extracting parameters from an input sound (pitch period, signal classification parameter) , except for a one in which no positive discrimination is possible between voiced and unvoiced sounds , for forming the wide-band voiced and unvoiced sound code books and narrow-band voiced and unvoiced sound code books .

EP0911807A2
CLAIM 13
The method as set forth in Claim 10 , 11 or 12 , wherein an autocorrelation is used as the characteristic parameter , the autocorrelation is generated from the second coded parameter ;
the autocorrelation is quantized by comparison with a narrow-band correlation determined by convolution between a wide-band autocorrelation in the wide-band sound code books and an autocorrelation of the impulse response (impulse responses, impulse response, LP filter) of a band stop filter ;
and the quantized data is dequantized using the wide-band sound code books to synthesize a sound .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2000132194A

Filed: 1998-10-22     Issued: 2000-05-12

Signal encoding apparatus and method, and signal decoding apparatus and method (信号符号化装置及び方法、並びに信号復号装置及び方法)

(Original Assignee) Sony Corp; ソニー株式会社     

Kenichi Makino, Atsushi Matsumoto, 淳 松本, 堅一 牧野
US7693710B2
CLAIM 1
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding apparatus according to claim 1 , wherein normalization means for removing the correlation of the signal waveform of the input signal and extracting a residual is provided on the input side of the orthogonal transform means ; the normalization means comprises an LPC inverse filter which outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal , and a pitch inverse filter which removes the pitch correlation of the LPC prediction residual based on pitch parameters obtained by pitch analysis of the LPC prediction residual ; and the weight calculation means calculates a weight based on the LPC coefficients and the pitch parameters .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding apparatus according to claim 1 , wherein normalization means for removing the correlation of the signal waveform of the input signal and extracting a residual is provided on the input side of the orthogonal transform means ; the normalization means comprises an LPC inverse filter which outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal , and a pitch inverse filter which removes the pitch correlation of the LPC prediction residual based on pitch parameters obtained by pitch analysis of the LPC prediction residual ; and the weight calculation means calculates a weight based on the LPC coefficients and the pitch parameters .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2000132194A
CLAIM 1
[Claim 1] A signal encoding apparatus which encodes an input signal on the time axis by applying an orthogonal transform thereto by orthogonal transform means , the apparatus being characterized by having (有すること) (maximum amplitude) : weight calculation means for calculating a weight according to the input signal ; and quantization means which ranks the coefficient data from the orthogonal transform means according to the order of the weights from the weight calculation means and performs quantization with higher precision for the higher-ranked data .

JP2000132194A
CLAIM 7
[Claim 7] The signal encoding apparatus according to claim 1 , wherein normalization means for removing the correlation of the signal waveform of the input signal and extracting a residual is provided on the input side of the orthogonal transform means ; the normalization means comprises an LPC inverse filter which outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal , and a pitch inverse filter which removes the pitch correlation of the LPC prediction residual based on pitch parameters obtained by pitch analysis of the LPC prediction residual ; and the weight calculation means calculates a weight based on the LPC coefficients and the pitch parameters .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding apparatus according to claim 1 , wherein normalization means for removing the correlation of the signal waveform of the input signal and extracting a residual is provided on the input side of the orthogonal transform means ; the normalization means comprises an LPC inverse filter which outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal , and a pitch inverse filter which removes the pitch correlation of the LPC prediction residual based on pitch parameters obtained by pitch analysis of the LPC prediction residual ; and the weight calculation means calculates a weight based on the LPC coefficients and the pitch parameters .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding apparatus according to claim 1 , wherein normalization means for removing the correlation of the signal waveform of the input signal and extracting a residual is provided on the input side of the orthogonal transform means ; the normalization means comprises an LPC inverse filter which outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal , and a pitch inverse filter which removes the pitch correlation of the LPC prediction residual based on pitch parameters obtained by pitch analysis of the LPC prediction residual ; and the weight calculation means calculates a weight based on the LPC coefficients and the pitch parameters .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (線形予測符号化) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding apparatus according to claim 1 , wherein normalization means for removing the correlation of the signal waveform of the input signal and extracting a residual is provided on the input side of the orthogonal transform means ; the normalization means comprises an LPC inverse filter which outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal , and a pitch inverse filter which removes the pitch correlation of the LPC prediction residual based on pitch parameters obtained by pitch analysis of the LPC prediction residual ; and the weight calculation means calculates a weight based on the LPC coefficients and the pitch parameters .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (linear predictive coding) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.
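
The special cases recited in US7693710B2 claim 7 above (a voiced-to-unvoiced transition, and comfort noise followed by active speech) can be read as a degenerate form of the same gain interpolation. A hedged sketch, with hypothetical names and boolean flags chosen here for illustration only:

def transition_gains(g_begin, g_end, voiced_to_unvoiced, comfort_noise_to_speech):
    # In the two charted transition cases, the gain at the beginning of the
    # first good frame is simply set equal to the gain used at its end.
    if voiced_to_unvoiced or comfort_noise_to_speech:
        return g_end, g_end
    return g_begin, g_end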

US7693710B2
CLAIM 8
. A method of concealing frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (linear predictive coding) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.
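
The relation recited in US7693710B2 claim 9 above, as we read the garbled rendering, is E_q = E_1 × (E_LP0 / E_LP1). A hedged Python sketch of how such a value could be computed follows; lp_impulse_energy, the 64-sample truncation and the use of scipy.signal.lfilter are assumptions for illustration only, not the patented implementation.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a, n=64):
    # Energy of the (truncated) impulse response of the all-pole filter 1/A(z),
    # where a = [1, a1, ..., ap] are the LP coefficients and n is an assumed length.
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    # E_q = E_1 * (E_LP0 / E_LP1), as we read the charted relation.
    return e1 * lp_impulse_energy(a_last_good) / lp_impulse_energy(a_first_good)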

US7693710B2
CLAIM 10
. A method of concealing frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 11
. A method of concealing frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (comprising) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JP2000132194A
CLAIM 1
[Claim 1] A signal encoding device that encodes an input signal on the time axis by applying an orthogonal transform thereto by orthogonal transform means, the device comprising (maximum amplitude): weight calculation means for calculating a weight according to the input signal; and quantization means for ranking the coefficient data from the orthogonal transform means in order of the weights from the weight calculation means and quantizing the data with higher precision the higher its rank.

JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.
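
For the phase-information step of US7693710B2 claim 11 above, a minimal sketch, assuming the residual signal and pitch period are available as a NumPy array and an integer, and that the position is quantized with a hypothetical uniform step; this illustrates the claimed selection of the maximum-amplitude sample, not the patented quantizer.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    # The sample of maximum amplitude inside the first pitch period of the LP
    # residual is taken as the first glottal pulse; its position is quantized
    # with an assumed uniform step.
    pos = int(np.argmax(np.abs(residual[:pitch_period])))
    q_index = pos // step
    return pos, q_index, q_index * step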

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (linear predictive coding) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.
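
For the artificial onset reconstruction of US7693710B2 claim 13 above, a hedged sketch assuming an integer quantized pulse position, an integer average pitch value and a hypothetical symmetric low-pass FIR lp_taps; it is an illustration of the claimed pulse-train construction, not the patented decoder.

import numpy as np

def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, lp_taps):
    # Unit pulses: the first at the quantized glottal-pulse position, the rest
    # spaced one average pitch apart up to the end of the affected span.
    pulses = np.zeros(frame_len)
    pos = q_pulse_pos
    while pos < frame_len:
        pulses[pos] = 1.0
        pos += avg_pitch
    # Low-pass filter the train so each impulse response of lp_taps (assumed
    # symmetric FIR) is centered on a pulse position.
    half = len(lp_taps) // 2
    return np.convolve(pulses, lp_taps, mode="full")[half:half + frame_len]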

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (comprising) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2000132194A
CLAIM 1
[Claim 1] A signal encoding device that encodes an input signal on the time axis by applying an orthogonal transform thereto by orthogonal transform means, the device comprising (maximum amplitude): weight calculation means for calculating a weight according to the input signal; and quantization means for ranking the coefficient data from the orthogonal transform means in order of the weights from the weight calculation means and quantizing the data with higher precision the higher its rank.

JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.
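
For the energy information parameter of US7693710B2 claim 16 above, a minimal sketch assuming string class labels and a NumPy frame; the choice of maximum signal energy versus average energy per sample follows the claim wording, while the function name and interfaces are hypothetical.

import numpy as np

def energy_information_parameter(frame, frame_class):
    # Maximum of the signal energy for voiced/onset frames, average energy per
    # sample for all other classes (class labels are assumed strings).
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))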

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (linear predictive coding) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (linear predictive coding) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (linear predictive coding) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (comprising) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JP2000132194A
CLAIM 1
[Claim 1] A signal encoding device that encodes an input signal on the time axis by applying an orthogonal transform thereto by orthogonal transform means, the device comprising (maximum amplitude): weight calculation means for calculating a weight according to the input signal; and quantization means for ranking the coefficient data from the orthogonal transform means in order of the weights from the weight calculation means and quantizing the data with higher precision the higher its rank.

JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (linear predictive coding) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (linear predictive coding) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JP2000132194A
CLAIM 7
[Claim 7] The signal encoding device according to claim 1, wherein normalization means for removing correlation in the signal waveform of the input signal and extracting a residual is provided at the input side of the orthogonal transform means; the normalization means comprises an LPC inverse filter that outputs an LPC prediction residual of the input signal based on LPC coefficients obtained by linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) (LPC) analysis of the input signal, and a pitch inverse filter that removes the pitch correlation of the LPC prediction residual based on a pitch parameter obtained by pitch analysis of the LPC prediction residual; and the weight calculation means calculates the weight based on the LPC coefficients and the pitch parameter.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6289297B1

Filed: 1998-10-09     Issued: 2001-09-11

Method for reconstructing a video frame received from a video source over a communication channel

(Original Assignee) Microsoft Corp     (Current Assignee) Microsoft Technology Licensing LLC

Paramvir Bahl
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (transmission error) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US6289297B1
CLAIM 6
. A computer-readable medium having computer executable instructions for use in reconstructing a video image , the instructions performing steps (LP filter) comprising : receiving from a video source a first plurality of transmissions over a communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame ;
determining if a transmission error is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (steps c) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6289297B1
CLAIM 6
. A computer-readable medium having computer executable instructions for use in reconstructing a video image , the instructions performing steps (LP filter) comprising : receiving from a video source a first plurality of transmissions over a communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame ;
determining if a transmission error is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US6289297B1
CLAIM 6
. A computer-readable medium having computer executable instructions for use in reconstructing a video image , the instructions performing steps (LP filter) comprising : receiving from a video source a first plurality of transmissions over a communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame ;
determining if a transmission error is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
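
Illustrative note (not part of the claim chart): the artificial construction of the periodic excitation part recited in claim 13 above, a low-pass filtered train of pulses anchored on the quantized first glottal pulse position and spaced by the average pitch, can be sketched as follows. The frame length of 256 samples, the toy low-pass FIR and the function name build_periodic_part are assumptions for illustration only.

import numpy as np

def build_periodic_part(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    # Center the first impulse response of the low-pass filter on the quantized
    # first glottal pulse position, then place the remaining impulse responses
    # one average pitch period apart until the end of the constructed segment.
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = float(first_pulse_pos)
    while pos < frame_len:
        start = int(round(pos)) - half
        for i, h in enumerate(lp_ir):
            idx = start + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += avg_pitch
    return exc

# Example with assumed values: a 256-sample segment, pulse position 37, pitch 80.
lp_ir = np.array([0.25, 0.5, 1.0, 0.5, 0.25])
periodic = build_periodic_part(256, first_pulse_pos=37, avg_pitch=80.0, lp_ir=lp_ir)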
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
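
Illustrative note (not part of the claim chart): the glottal pulse search recited in claim 15 above, taking the sample of maximum amplitude within a pitch period and quantizing its position, can be sketched as follows. The use of the LP residual, the 6-bit uniform quantizer and the helper name first_glottal_pulse_position are assumptions for illustration only.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, n_bits=6):
    # Take the sample of maximum amplitude within the first pitch period as the
    # first glottal pulse, then quantize its position with a uniform quantizer
    # (the 6-bit uniform scheme is an assumption of this sketch).
    T = int(pitch_period)
    segment = np.asarray(residual[:T], dtype=float)
    pos = int(np.argmax(np.abs(segment)))
    step = max(T / float(2 ** n_bits), 1.0)
    index = int(pos / step)              # hypothetical transmitted quantization index
    quantized_pos = int(round(index * step))
    return pos, index, quantized_pos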
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
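
Illustrative note (not part of the claim chart): one plausible reading of the energy information parameter recited in claim 16 above is sketched below, a maximum of the signal energy for voiced or onset frames and an average energy per sample otherwise. The choice of squared samples over the last pitch period for the voiced/onset case is an assumption for illustration, not a statement of the patented computation.

import numpy as np

def energy_information(frame, frame_class, pitch_period):
    # Voiced/onset frames: relate the parameter to a maximum of the signal energy
    # (here, maximum squared sample over the last pitch period, an assumption).
    # Other frames: relate it to the average energy per sample.
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        T = min(max(int(pitch_period), 1), len(x))
        return float(np.max(x[-T:] ** 2))
    return float(np.mean(x ** 2))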
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
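
Illustrative note (not part of the claim chart): the energy control recited in claim 17 above, scaling the synthesized signal so that its energy at the start of the first good frame matches the end of the concealed frame and then converging toward the received energy information by the frame end while limiting any energy increase, can be sketched as follows. The quarter-frame measurement windows, the linear gain interpolation and the gain cap value are assumptions for illustration only.

import numpy as np

def rescale_first_good_frame(synth, e_end_of_concealed, e_received, max_gain=2.0):
    # g0 matches the frame-start energy to the end of the last concealed frame;
    # g1 converges the frame-end energy to the received energy information
    # parameter; both gains are capped to limit an increase in energy.
    x = np.asarray(synth, dtype=float)
    n = len(x)
    w = max(n // 4, 1)
    g0 = min(np.sqrt(e_end_of_concealed / (np.mean(x[:w] ** 2) + 1e-12)), max_gain)
    g1 = min(np.sqrt(e_received / (np.mean(x[-w:] ** 2) + 1e-12)), max_gain)
    gains = np.linspace(g0, g1, n)   # assumed linear interpolation of the gain
    return x * gains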
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (transmission error) , limits to a given value a gain used for scaling the synthesized sound signal .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US6289297B1
CLAIM 6
. A computer-readable medium having computer executable instructions for use in reconstructing a video image , the instructions performing steps (LP filter) comprising : receiving from a video source a first plurality of transmissions over a communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame ;
determining if a transmission error is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (steps c) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6289297B1
CLAIM 6
. A computer-readable medium having computer executable instructions for use in reconstructing a video image , the instructions performing steps (LP filter) comprising : receiving from a video source a first plurality of transmissions over a communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame ;
determining if a transmission error is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (transmission error) and decoder recovery (transmission error) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6289297B1
CLAIM 1
. A method for reconstructing a video frame received from a video source over a communication network , the method comprising the steps of : receiving from the video source a first plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame : determining if a transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .

US6289297B1
CLAIM 6
. A computer-readable medium having computer executable instructions for use in reconstructing a video image , the instructions performing steps (LP filter) comprising : receiving from a video source a first plurality of transmissions over a communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a first video frame ;
receiving from the video source a second plurality of transmissions over the communications network wherein each transmission contains information representative of a discrete spatial component corresponding to a particular region of a second video frame ;
determining if a transmission error is associated with any of the second plurality of transmissions ;
substituting for each of the second plurality of transmissions that is determined to have an associated error the information contained in one of the first plurality of transmissions to thereby form a reconstructed second video frame ;
and requesting that the video source give priority to a particular transmission in a third plurality of transmissions , wherein each transmission in the third plurality of transmissions contains information representative of a discrete spatial component corresponding to a particular region of a third video frame , wherein the requested priority is a function of the determined errors associated with the second plurality of transmissions .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6202045B1

Filed: 1998-09-30     Issued: 2001-03-13

Speech coding with variable model order linear prediction

(Original Assignee) Nokia Mobile Phones Ltd     (Current Assignee) Provenance Asset Group LLC ; Nokia USA Inc

Pasi Ojala, Ari Lakaniemi, Vesa T. Ruoppila
US7693710B2
CLAIM 1
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (PC mode) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (PC mode) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .
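
A rough sketch of the two-step energy control recited in claim 17: the synthesized signal of the first good frame is scaled so that its initial energy matches the energy at the end of the last concealed frame, and the gain is then driven toward a value derived from the received energy information parameter toward the end of the frame, with any increase capped. The sample-wise linear gain interpolation, the quarter-frame energy windows and the cap are assumptions, not the patent's procedure.

#include <math.h>
#include <stddef.h>

/*
 * Scale the synthesized signal of the first good frame after an erasure.
 *   g0: gain that matches the frame-start energy to the energy at the end
 *       of the last concealed frame (e_start_target).
 *   g1: gain that matches the frame-end energy to the energy signalled by
 *       the received energy information parameter (e_end_target).
 * The gain is interpolated linearly across the frame (an assumption) and
 * any increase is limited by max_gain.
 */
static void scale_first_good_frame(double *synth, size_t n,
                                   double e_start_target, double e_end_target,
                                   double max_gain)
{
    if (n < 2)
        return;

    double e_start = 0.0, e_end = 0.0;
    size_t head = n / 4;                  /* leading/trailing regions (assumed) */

    for (size_t i = 0; i < head; i++)
        e_start += synth[i] * synth[i];
    for (size_t i = n - head; i < n; i++)
        e_end += synth[i] * synth[i];

    double g0 = (e_start > 0.0) ? sqrt(e_start_target / e_start) : 1.0;
    double g1 = (e_end   > 0.0) ? sqrt(e_end_target   / e_end)   : 1.0;

    if (g0 > max_gain) g0 = max_gain;     /* limit the increase in energy */
    if (g1 > max_gain) g1 = max_gain;

    for (size_t i = 0; i < n; i++) {
        double g = g0 + (g1 - g0) * (double)i / (double)(n - 1);
        synth[i] *= g;
    }
}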

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (PC mode) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .
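
Claim 18 only adds that, when the first good frame after the erasure is classified as onset, the scaling gain is limited to a given value. On top of the previous sketch this is a single clamp; the limit value below is a placeholder, not taken from the patent.

/* Clamp the onset-frame scaling gain to an illustrative ceiling. */
#define ONSET_GAIN_LIMIT 1.0   /* placeholder value, not from the patent */

static double limit_onset_gain(double gain)
{
    return (gain > ONSET_GAIN_LIMIT) ? ONSET_GAIN_LIMIT : gain;
}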

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .
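
Claim 19 reuses the end-of-frame gain at the beginning of the frame in two situations: a voiced-to-unvoiced transition across the erasure, and a comfort-noise-to-active-speech transition. A small decision helper, reusing the frame_class_t enum from the earlier sketch; the coding-type enum is assumed for illustration.

#include <stdbool.h>

typedef enum { CODED_COMFORT_NOISE, CODED_ACTIVE_SPEECH } coding_type_t;

/*
 * Returns true when the beginning-of-frame gain should simply reuse the
 * end-of-frame gain instead of matching the energy of the concealed signal:
 *  - the last good frame before the erasure was voiced transition, voiced
 *    or onset and the first good frame after it is unvoiced, or
 *  - the erasure falls on a comfort-noise to active-speech transition.
 * frame_class_t is the enum from the earlier energy-parameter sketch.
 */
static bool reuse_end_gain(frame_class_t last_good, frame_class_t first_good,
                           coding_type_t last_coding, coding_type_t first_coding)
{
    bool voiced_to_unvoiced =
        (last_good == FRAME_VOICED_TRANSITION ||
         last_good == FRAME_VOICED ||
         last_good == FRAME_ONSET) &&
        first_good == FRAME_UNVOICED;

    bool dtx_to_speech =
        last_coding == CODED_COMFORT_NOISE &&
        first_coding == CODED_ACTIVE_SPEECH;

    return voiced_to_unvoiced || dtx_to_speech;
}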

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (PC mode) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .
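
The relation recited in claim 21, reconstructed from its variable definitions, is E_q = E_1 × (E_LP0 / E_LP1), with E_LP0 and E_LP1 the energies of the impulse responses of the LP synthesis filters of the last good frame before the erasure and of the first good frame after it. The helper below computes an LP filter's impulse-response energy by running the all-pole recursion on a unit impulse and then applies the relation; the truncation length and the maximum order of 32 are assumptions.

#include <stddef.h>

/*
 * Energy of the impulse response of an all-pole LP synthesis filter
 * 1 / A(z), with A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order,
 * computed by running the recursion on a unit impulse for `len` samples
 * (the truncation length, e.g. 64, is an illustrative choice).
 */
static double lp_impulse_response_energy(const double *a, int order, int len)
{
    double mem[32] = { 0.0 };   /* filter memory, order assumed <= 32 */
    double energy = 0.0;

    for (int n = 0; n < len; n++) {
        double x = (n == 0) ? 1.0 : 0.0;     /* unit impulse input */
        double y = x;
        for (int k = 0; k < order; k++)
            y -= a[k + 1] * mem[k];
        for (int k = order - 1; k > 0; k--)  /* shift filter memory */
            mem[k] = mem[k - 1];
        mem[0] = y;
        energy += y * y;
    }
    return energy;
}

/* E_q = E_1 * E_LP0 / E_LP1, per the relation recited in claim 21. */
static double adjusted_excitation_energy(double e1, double e_lp0, double e_lp1)
{
    return (e_lp1 > 0.0) ? e1 * e_lp0 / e_lp1 : e1;
}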

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .
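
Claim 22's phase information parameter describes the first glottal pulse by its position, shape, sign and amplitude. A minimal C sketch of an encoder-side search; the struct layout, the use of the LP residual as the search signal and the fixed shape index are assumptions, since the claim does not specify how the shape is chosen.

#include <math.h>

/* Phase information parameter as recited in claim 22: the first glottal
 * pulse is described by its position, shape, sign and amplitude. The
 * shape codebook implied below is purely illustrative. */
typedef struct {
    int position;     /* sample index of the pulse within the frame        */
    int shape_index;  /* index into an assumed codebook of pulse shapes    */
    int sign;         /* +1 or -1                                          */
    double amplitude; /* magnitude of the pulse (unquantized here)         */
} glottal_pulse_info_t;

static glottal_pulse_info_t encode_first_glottal_pulse(const double *residual,
                                                       int pitch_period)
{
    glottal_pulse_info_t p = { 0, 0, 1, 0.0 };

    /* Take the sample of maximum amplitude within the first pitch period
     * as the first glottal pulse (as in claim 23). */
    for (int i = 0; i < pitch_period; i++) {
        if (fabs(residual[i]) > p.amplitude) {
            p.amplitude = fabs(residual[i]);
            p.position = i;
            p.sign = (residual[i] >= 0.0) ? 1 : -1;
        }
    }

    /* Shape selection is codec specific; a real encoder would match the
     * samples around p.position against a small codebook of pulse shapes.
     * Left as a fixed index here. */
    p.shape_index = 0;
    return p;
}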

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .
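
Claim 23 additionally quantizes the position of the maximum-amplitude sample within the pitch period. A uniform quantizer that fits the index into a fixed bit budget is one straightforward reading; the bit budget and the uniform step are assumptions, since the claim only requires that the position be quantized.

/*
 * Uniformly quantize the position of the maximum-amplitude sample within
 * the pitch period so that the index fits into `bits` bits.
 */
static int quantize_pulse_position(int position, int pitch_period, int bits)
{
    int levels = 1 << bits;
    int step = (pitch_period + levels - 1) / levels;   /* ceiling division */
    if (step < 1)
        step = 1;
    return position / step;        /* transmitted index                    */
}

static int dequantize_pulse_position(int index, int pitch_period, int bits)
{
    int levels = 1 << bits;
    int step = (pitch_period + levels - 1) / levels;
    if (step < 1)
        step = 1;
    return index * step;           /* reconstructed position               */
}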

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6202045B1
CLAIM 1
. A method of coding a sampled speech signal (speech signal, decoder determines concealment) , the method comprising dividing the speech signal into sequential frames and , for each current frame : generating a first set of linear prediction coding (LPC) coefficients which correspond to the coefficients of a linear filter and which are representative of short term redundancy in the current frame ;
if the number of LPC coefficients in the first set of the current frame differs from the number in the first set of the preceding frame , then generating a second expanded or contracted set of LPC coefficients from the first set of LPC coefficients generated for the preceding frame , the second set containing a number of LPC coefficients equal to the number of LPC coefficients in said first set of the current frame ;
and encoding the current frame using the first set of LPC coefficients of the current frame and the second set of LPC coefficients of the preceding frame .

US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6202045B1
CLAIM 14
. A method according to claim 13 , wherein a set or sets of expanded or contracted LPC coefficients of the preceding frame , corresponding to each available LPC model order (frame erasure) , is generated .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6385573B1

Filed: 1998-09-18     Issued: 2002-05-07

Adaptive tilt compensation for synthesized speech residual

(Original Assignee) Lakestar Semi Inc     (Current Assignee) Samsung Electronics Co Ltd

Yang Gao, Huan-Yu Su
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .
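
The adaptive tilt compensation recited in US6385573B1 depends on the encoding bit rate and on the flatness of the synthesized residual (claim 1), and amounts to identifying a coefficient for a compensating filter (claim 4). A first-order sketch, using the normalized first autocorrelation of the residual as the flatness measure and a caller-supplied rate-dependent factor; both choices are assumptions rather than the reference's actual method.

#include <stddef.h>

/*
 * Illustrative first-order tilt compensation on a synthesized residual.
 * The normalized first autocorrelation r1/r0 serves as the flatness/tilt
 * measure, and a bit-rate dependent factor scales how strongly it is
 * compensated.
 */
static void tilt_compensate(double *res, size_t n, double rate_factor)
{
    double r0 = 0.0, r1 = 0.0;
    for (size_t i = 0; i < n; i++)
        r0 += res[i] * res[i];
    for (size_t i = 1; i < n; i++)
        r1 += res[i] * res[i - 1];

    double tilt = (r0 > 0.0) ? (r1 / r0) : 0.0;   /* flatness/tilt measure     */
    double mu = rate_factor * tilt;               /* compensating coefficient  */

    /* First-order FIR compensating filter: y[n] = x[n] - mu * x[n-1]. */
    double prev = 0.0;
    for (size_t i = 0; i < n; i++) {
        double cur = res[i];
        res[i] = cur - mu * prev;
        prev = cur;
    }
}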

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6385573B1
CLAIM 4
. The speech system of claim 1 wherein the adaptive tilt compensation comprises identifying (LP filter) a filter coefficient for use in a compensating filter .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6385573B1
CLAIM 4
. The speech system of claim 1 wherein the adaptive tilt compensation comprises identifying (LP filter) a filter coefficient for use in a compensating filter .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6385573B1
CLAIM 4
. The speech system of claim 1 wherein the adaptive tilt compensation comprises identifying (LP filter) a filter coefficient for use in a compensating filter .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6385573B1
CLAIM 4
. The speech system of claim 1 wherein the adaptive tilt compensation comprises identifying (LP filter) a filter coefficient for use in a compensating filter .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6385573B1
CLAIM 4
. The speech system of claim 1 wherein the adaptive tilt compensation comprises identifying (LP filter) a filter coefficient for use in a compensating filter .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6385573B1
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : at least one codebook containing at least one code vector ;
processing circuitry that generates a synthesized residual signal using the at least one codebook ;
and the processing circuitry applying adaptive tilt compensation to the synthesized residual signal based in part on an encoding bit rate of the speech system and a flatness of the synthesized residual signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6385573B1
CLAIM 4
. The speech system of claim 1 wherein the adaptive tilt compensation comprises identifying (LP filter) a filter coefficient for use in a compensating filter .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US20010023395A1

Filed: 1998-09-18     Issued: 2001-09-20

Speech encoder adaptively applying pitch preprocessing with warping of target signal

(Original Assignee) Lakestar Semi Inc     (Current Assignee) Samsung Electronics Co Ltd

Huan-Yu Su, Yang Gao
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .
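
Claim 1 of the '710 patent (mapped above) constructs the periodic excitation of a lost onset frame artificially, as a low-pass filtered train of pulses: the first impulse response of the low-pass filter is centred on the decoded position of the first glottal pulse and the following ones are spaced by the average pitch value. A minimal sketch; the 5-tap impulse response and the frame-level handling are illustrative only.

#include <stddef.h>
#include <string.h>

/*
 * Artificial periodic excitation for a lost onset frame: a train of
 * low-pass filter impulse responses, the first one centred on the decoded
 * (quantized) position of the first glottal pulse and the following ones
 * spaced by the average pitch value.
 */
static void build_periodic_excitation(double *exc, int exc_len,
                                      int first_pulse_pos, int avg_pitch)
{
    static const double lp_ir[5] = { 0.10, 0.25, 0.30, 0.25, 0.10 };
    const int half = 2;   /* (length - 1) / 2, used to centre the response */

    if (avg_pitch < 1 || exc_len <= 0)
        return;

    memset(exc, 0, (size_t)exc_len * sizeof(*exc));

    for (int pos = first_pulse_pos; pos < exc_len; pos += avg_pitch) {
        for (int k = 0; k < 5; k++) {
            int idx = pos - half + k;     /* centre the impulse response */
            if (idx >= 0 && idx < exc_len)
                exc[idx] += lp_ir[k];
        }
    }
}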

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder (last frame, replacement frame) using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder (last frame, replacement frame) using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (pitch p) = E_1 × (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing (E q) that employs continuous warping .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder to the encoder processing circuit .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
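
Reading the recited relation as a simple ratio scaling, E_q = E_1 · (E_LP0 / E_LP1), the sketch below computes the LP-filter impulse-response energies with scipy-style filtering and returns the adjusted excitation energy. The coefficient layout, impulse-response length and function names are assumptions made for illustration only.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z).
    a_coeffs -- direct-form denominator [1, a1, ..., aM] (assumed layout)."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1): target excitation energy for the first
    non-erased frame when its LP gain exceeds that of the last erased frame."""
    e_lp0 = lp_impulse_response_energy(a_last_good)   # last good frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # first good frame after erasure
    return e1 * e_lp0 / e_lp1
```

The ratio of impulse-response energies is used here as a stand-in for the ratio of LP-filter gains, which is one way the recited adjustment can be pictured; the patent's own implementation details may differ.
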
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing (E q) that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder (last frame, replacement frame) using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
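
As a hedged illustration of the artificially constructed periodic excitation recited here, the sketch below centres one copy of a low-pass filter impulse response on the quantized first-pulse position and places further copies every average pitch period until the end of the affected span. The FIR taps, frame length and pitch value are placeholders, not parameters from the patent.

```python
import numpy as np

def build_periodic_excitation(span_len, first_pulse_pos, avg_pitch, lp_taps):
    """Construct a low-pass filtered periodic train of pulses.

    span_len        -- samples covered by the artificial construction
    first_pulse_pos -- quantized position of the first glottal pulse
    avg_pitch       -- rounded average pitch value in samples (must be > 0)
    lp_taps         -- impulse response of the low-pass filter (assumed FIR)
    """
    excitation = np.zeros(span_len)
    half = len(lp_taps) // 2
    pos = first_pulse_pos
    # Centre one impulse response on each pulse position, spaced by avg_pitch.
    while pos < span_len:
        for k, tap in enumerate(lp_taps):
            idx = pos - half + k
            if 0 <= idx < span_len:
                excitation[idx] += tap
        pos += avg_pitch
    return excitation
```
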
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
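
A purely illustrative sketch of encoding the shape, sign and amplitude of the first glottal pulse is given below; the prototype-shape codebook, logarithmic amplitude quantizer and bit allocation are assumptions and do not reflect the patent's actual quantizers.

```python
import numpy as np

def encode_first_pulse(residual, position, shapes, amp_bits=4):
    """Purely illustrative: encode sign, amplitude and shape of the first
    glottal pulse. 'shapes' is an assumed codebook of short prototype pulses
    (rows of a 2-D array); nothing here is taken from the patent."""
    pulse_amp = residual[position]
    sign = 0 if pulse_amp >= 0 else 1
    # Logarithmic scalar quantization of the absolute amplitude (assumed).
    levels = 2 ** amp_bits
    amp_index = int(np.clip(np.round(np.log2(abs(pulse_amp) + 1.0)), 0, levels - 1))
    # Pick the prototype shape best correlated with the samples around the pulse.
    seg = residual[position:position + shapes.shape[1]]
    seg = np.pad(seg, (0, shapes.shape[1] - len(seg)))
    shape_index = int(np.argmax(shapes @ seg))
    return sign, amp_index, shape_index
```
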
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
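
A minimal sketch of one possible reading of this energy-information computation, assuming string class labels and a NumPy frame buffer, follows; the patent itself may compute the maximum energy pitch-synchronously, which this illustration does not attempt.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Hypothetical reading of the energy information parameter:
    maximum of the signal energy for voiced/onset frames, average energy
    per sample otherwise (class labels are assumed strings)."""
    if frame_class in ("voiced", "onset"):
        # Maximum squared sample value in the frame.
        return float(np.max(frame ** 2))
    # Average energy per sample for unvoiced / transition frames.
    return float(np.mean(frame ** 2))
```
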
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
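
The energy-control behaviour recited here can be pictured as a per-sample gain ramp: start from a gain that matches the energy at the end of the last concealed frame, converge toward the received energy information by the end of the first good frame, and cap the gain so the energy increase stays limited. The sketch below is an assumption-laden illustration (the linear interpolation and the gain cap are not taken from the patent).

```python
import numpy as np

def scale_recovered_frame(synth, e_concealed_end, e_received, max_gain=2.0):
    """Illustrative per-sample gain ramp over the first non-erased frame.

    synth           -- synthesized samples of the first good frame
    e_concealed_end -- energy at the end of the last concealed frame
    e_received      -- energy corresponding to the received energy parameter
    max_gain        -- assumed bound limiting any increase in energy
    """
    n = len(synth)
    e_frame = np.mean(synth ** 2) + 1e-12
    g0 = min(np.sqrt(e_concealed_end / e_frame), max_gain)  # gain at frame start
    g1 = min(np.sqrt(e_received / e_frame), max_gain)       # gain at frame end
    gains = np.linspace(g0, g1, n)                          # gradual convergence
    return synth * gains
```
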
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder (last frame, replacement frame) using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
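
As a rough illustration of the two transition conditions recited here, the sketch below returns True when the scaling gain at the start of the first good frame would simply be set equal to the gain used at its end; the class and mode labels are assumed strings, not codec signalling defined by the patent.

```python
def use_flat_gain(last_good_class, first_good_class, last_good_mode, first_good_mode):
    """Sketch of the two recited conditions for equalizing the start-of-frame
    scaling gain with the end-of-frame gain (labels are assumptions)."""
    # Voiced-to-unvoiced transition around the erasure.
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    # Inactive-to-active transition (comfort noise followed by active speech).
    inactive_to_active = (last_good_mode == "comfort noise"
                          and first_good_mode == "active speech")
    return voiced_to_unvoiced or inactive_to_active
```
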
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder (last frame, replacement frame) using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing (E q) that employs continuous warping .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US20010023395A1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (sound signal, speech signal, decoder determines concealment) having varying characteristics , the speech encoding system comprising : an encoder processing circuit that adaptively selects a first encoding scheme or a second encoding scheme ;
and the first encoding scheme comprises pitch preprocessing (E q) that employs continuous warping .

US20010023395A1
CLAIM 8
. A speech encoder (last frame, replacement frame) using an analysis by synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode ;
the first long term prediction mode comprises pitch preprocessing ;
and an adaptive codebook (sound signal, speech signal, decoder determines concealment) coupled to the encoder processing circuit .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6260010B1

Filed: 1998-09-18     Issued: 2001-07-10

Speech encoder using gain normalization that combines open and closed loop gains

(Original Assignee) Lakestar Semi Inc     (Current Assignee) MACOM Technology Solutions Holdings Inc

Yang Gao, Jes Thyssen, Adil Benyassine
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (maximum limit) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US6260010B1
CLAIM 14
. The speech encoding system of claim 9 wherein the encoder processing circuit applies a maximum limit (average pitch value) in gain normalization processing .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook (average energy) .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US6260010B1
CLAIM 7
. The speech encoding system of claim 1 wherein the encoder processing circuit sets the gain normalization factor to the open loop gain normalization factor when the speech signal does not constitute background noise (LP filter) and a linear predictive coding gain is within a predetermined range .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260010B1
CLAIM 7
. The speech encoding system of claim 1 wherein the encoder processing circuit sets the gain normalization factor to the open loop gain normalization factor when the speech signal does not constitute background noise (LP filter) and a linear predictive coding gain is within a predetermined range .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US6260010B1
CLAIM 7
. The speech encoding system of claim 1 wherein the encoder processing circuit sets the gain normalization factor to the open loop gain normalization factor when the speech signal does not constitute background noise (LP filter) and a linear predictive coding gain is within a predetermined range .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (maximum limit) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US6260010B1
CLAIM 14
. The speech encoding system of claim 9 wherein the encoder processing circuit applies a maximum limit (average pitch value) in gain normalization processing .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook (average energy) .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

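Claim 19 recites two cases in which the scaling gain at the beginning of the first non erased frame is simply made equal to the gain at its end, i.e. no gradual convergence: a voiced-type frame before the erasure followed by an unvoiced frame after it, and a comfort-noise frame followed by active speech. A minimal sketch under assumed class labels; the strings and function names are editorial.

    def use_flat_gain(last_good_class, first_good_class,
                      last_good_was_comfort_noise, first_good_is_active_speech):
        # Case 1: voiced transition / voiced / onset before the erasure, unvoiced after it.
        voiced_to_unvoiced = (
            last_good_class in ("voiced transition", "voiced", "onset")
            and first_good_class == "unvoiced"
        )
        # Case 2: comfort noise before the erasure, active speech after it.
        inactive_to_active = last_good_was_comfort_noise and first_good_is_active_speech
        return voiced_to_unvoiced or inactive_to_active

    def scale_first_good_frame(frame, g_begin, g_end, flat):
        # Flat gain: scale the whole frame by g_end; otherwise interpolate g_begin -> g_end.
        n = len(frame)
        return [
            (g_end if flat else g_begin + (g_end - g_begin) * i / max(n - 1, 1)) * s
            for i, s in enumerate(frame)
        ]
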
US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US6260010B1
CLAIM 7
. The speech encoding system of claim 1 wherein the encoder processing circuit sets the gain normalization factor to the open loop gain normalization factor when the speech signal does not constitute background noise (LP filter) and a linear predictive coding gain is within a predetermined range .

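In claim 20 the energy information parameter is not transmitted; the decoder instead compares the gain of the LP filter of the first non erased frame with that of the last erased frame and adjusts the excitation energy only when the new filter has the higher gain. The sketch below treats the energy of the LP synthesis-filter impulse response as that gain, consistent with how E_LP0 and E_LP1 are defined in claim 21; the impulse-response length and all names are editorial assumptions.

    def lp_impulse_response_energy(a, length=64):
        # Energy of the impulse response of the LP synthesis filter 1 / A(z),
        # with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...  (direct-form recursion).
        h = []
        for n in range(length):
            x = 1.0 if n == 0 else 0.0
            y = x - sum(a[k] * h[n - 1 - k] for k in range(min(len(a), n)))
            h.append(y)
        return sum(v * v for v in h)

    def needs_excitation_adjustment(a_first_good, a_last_erased):
        # Claim 20 condition: adjust only when the new LP filter has the higher gain.
        return lp_impulse_response_energy(a_first_good) > lp_impulse_response_energy(a_last_erased)
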
US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260010B1
CLAIM 7
. The speech encoding system of claim 1 wherein the encoder processing circuit sets the gain normalization factor to the open loop gain normalization factor when the speech signal does not constitute background noise (LP filter) and a linear predictive coding gain is within a predetermined range .

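The relation recited in claim 21, reconstructed above as E_q = E_1 · E_LP0 / E_LP1, scales the excitation energy by the ratio of the old and new LP filter impulse-response energies so that the higher gain of the new filter does not inflate the synthesized signal. A short worked sketch; the numbers in the example are made up.

    def adjusted_excitation_energy(e1, e_lp0, e_lp1):
        # E_q = E_1 * (E_LP0 / E_LP1): E_1 is the energy at the end of the current frame,
        # E_LP0 and E_LP1 the impulse-response energies of the LP filters of the last good
        # frame before the erasure and of the first good frame after it, respectively.
        return e1 * e_lp0 / e_lp1

    # If the new LP filter is four times more energetic, the excitation target drops to E_1 / 4.
    print(adjusted_excitation_energy(1.0, 0.5, 2.0))  # -> 0.25
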
US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

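Claim 22's phase information is a compact description of the first glottal pulse in the frame: its position together with an encoded shape, sign and amplitude that the communication link carries to the decoder. A hedged sketch of what such an encoder-side searcher could emit; the shape codebook, the local five-sample window and every name below are assumptions made only for illustration.

    def describe_first_glottal_pulse(excitation, pitch_period, shape_codebook):
        # Locate the first glottal pulse as the strongest sample in the first pitch
        # period, then describe it by position, sign, amplitude and the index of the
        # closest entry in a small codebook of pulse shapes.
        segment = excitation[:pitch_period]
        position = max(range(len(segment)), key=lambda i: abs(segment[i]))
        amplitude = abs(segment[position])
        sign = 1 if segment[position] >= 0 else -1
        local = segment[max(0, position - 2):position + 3]
        def err(shape):
            return sum((a - sign * amplitude * b) ** 2 for a, b in zip(local, shape))
        shape_index = min(range(len(shape_codebook)), key=lambda k: err(shape_codebook[k]))
        return {"position": position, "sign": sign,
                "amplitude": amplitude, "shape_index": shape_index}
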
US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

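Claim 23 pins the phase information down further: the first glottal pulse is the sample of maximum amplitude within a pitch period, and its position within that period is quantized. A minimal sketch with a uniform position quantizer; the step size and names are editorial assumptions.

    def quantize_glottal_pulse_position(excitation, pitch_period, step=4):
        # Take the maximum-amplitude sample within one pitch period as the first
        # glottal pulse and quantize its position to a grid of `step` samples.
        segment = excitation[:pitch_period]
        position = max(range(len(segment)), key=lambda i: abs(segment[i]))
        q_index = position // step                    # transmitted index
        q_position = min(q_index * step, pitch_period - 1)
        return q_index, q_position
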
US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook (average energy) .

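Claim 24 computes the energy information parameter differently per class: a maximum of the signal energy for frames classified as voiced or onset, and an average energy per sample for the other classes. A brief sketch in which the "maximum" is taken over a sliding pitch-sized window; that windowing choice, like the names, is an editorial assumption.

    def energy_information(frame, frame_class, pitch_period=None):
        # Voiced / onset frames: maximum short-term energy found in the frame.
        # Other frames: average energy per sample over the whole frame.
        if frame_class in ("voiced", "onset"):
            window = pitch_period or max(len(frame) // 4, 1)
            best = 0.0
            for start in range(0, len(frame) - window + 1):
                e = sum(s * s for s in frame[start:start + window]) / window
                best = max(best, e)
            return best
        return sum(s * s for s in frame) / len(frame)
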
US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6260010B1
CLAIM 2
. The speech encoding system of claim 1 wherein the excitation vectors are determined from a plurality of codebooks comprising an adaptive codebook (sound signal, speech signal) and a fixed codebook .

US6260010B1
CLAIM 7
. The speech encoding system of claim 1 wherein the encoder processing circuit sets the gain normalization factor to the open loop gain normalization factor when the speech signal does not constitute background noise (LP filter) and a linear predictive coding gain is within a predetermined range .

US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6330533B2

Filed: 1998-09-18     Issued: 2001-12-11

Speech encoder adaptively applying pitch preprocessing with warping of target signal

(Original Assignee) Lakestar Semi Inc     (Current Assignee) Samsung Electronics Co Ltd

Huan-Yu Su, Yang Gao
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

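Independent claim 1, charted here against US6330533B2, rebuilds the periodic excitation of a lost onset artificially: the impulse response of a low-pass filter is centred on the quantized position of the first glottal pulse and repeated every average pitch period up to the end of the last affected subframe. A compact sketch of that construction; the three-tap FIR coefficients and all names are illustrative assumptions, not the codec's actual low-pass filter.

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch,
                                  lp_ir=(0.25, 0.5, 0.25)):
        # Artificial periodic part: a low-pass filtered train of pulses separated by the
        # average pitch value, the first impulse response centred on the quantized
        # position of the first glottal pulse relative to the start of the onset frame.
        excitation = [0.0] * frame_len
        half = len(lp_ir) // 2
        pos = first_pulse_pos
        while pos < frame_len:                        # up to the end of the affected region
            for k, h in enumerate(lp_ir):             # place one impulse response at `pos`
                idx = pos - half + k
                if 0 <= idx < frame_len:
                    excitation[idx] += h
            pos += avg_pitch                          # next pulse one average pitch later
        return excitation
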
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US6330533B2
CLAIM 12
. A speech encoder (last frame, replacement frame) using an analysis-by-synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode based , at least in part , on a selected bit rate for the speech signal ;
the first long term prediction mode comprising pitch preprocessing that employs warping by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region ;
and an adaptive codebook coupled to the encoder processing circuit , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified in the modified time region to conform to target interpolated pitch values prior to selecting a contribution of the adaptive codebook for the interval .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US6330533B2
CLAIM 12
. A speech encoder (last frame, replacement frame) using an analysis-by-synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode based , at least in part , on a selected bit rate for the speech signal ;
the first long term prediction mode comprising pitch preprocessing that employs warping by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region ;
and an adaptive codebook coupled to the encoder processing circuit , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified in the modified time region to conform to target interpolated pitch values prior to selecting a contribution of the adaptive codebook for the interval .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (pitch p) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module (E q) associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module (E q) associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US6330533B2
CLAIM 12
. A speech encoder (last frame, replacement frame) using an analysis-by-synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode based , at least in part , on a selected bit rate for the speech signal ;
the first long term prediction mode comprising pitch preprocessing that employs warping by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region ;
and an adaptive codebook coupled to the encoder processing circuit , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified in the modified time region to conform to target interpolated pitch values prior to selecting a contribution of the adaptive codebook for the interval .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US6330533B2
CLAIM 12
. A speech encoder (last frame, replacement frame) using an analysis-by-synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode based , at least in part , on a selected bit rate for the speech signal ;
the first long term prediction mode comprising pitch preprocessing that employs warping by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region ;
and an adaptive codebook coupled to the encoder processing circuit , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified in the modified time region to conform to target interpolated pitch values prior to selecting a contribution of the adaptive codebook for the interval .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US6330533B2
CLAIM 12
. A speech encoder (last frame, replacement frame) using an analysis-by-synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode based , at least in part , on a selected bit rate for the speech signal ;
the first long term prediction mode comprising pitch preprocessing that employs warping by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region ;
and an adaptive codebook coupled to the encoder processing circuit , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified in the modified time region to conform to target interpolated pitch values prior to selecting a contribution of the adaptive codebook for the interval .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q (pitch p) = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
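
The relation quoted in claim 21 can be transcribed directly. The sketch below is editorial; the impulse-response-energy helper (length 64, scipy filtering) is an assumption for illustration, not the patent's implementation.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, length: int = 64) -> float:
    """Energy of the impulse response of the LP synthesis filter 1/A(z)."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h * h))

def adjusted_excitation_energy(E1: float, a_last_good, a_first_good) -> float:
    """E_q = E_1 * E_LP0 / E_LP1, as recited in claim 21."""
    E_LP0 = lp_impulse_response_energy(a_last_good)   # last good frame before erasure
    E_LP1 = lp_impulse_response_energy(a_first_good)  # first good frame after erasure
    return E1 * E_LP0 / E_LP1
```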
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module (E q) associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
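
To make the phase-information element of claim 22 concrete, one possible encoder-side sketch is shown below. The pulse-shape codebook and the 4-bit amplitude quantizer are purely hypothetical placeholders introduced for illustration.

```python
import numpy as np

PULSE_SHAPES = np.array([            # hypothetical 3-entry shape codebook
    [0.2, 1.0, 0.2],
    [0.5, 1.0, 0.5],
    [0.0, 1.0, 0.0],
])

def encode_first_glottal_pulse(residual: np.ndarray, position: int):
    """Return (shape index, sign bit, amplitude index) for the first glottal pulse."""
    amplitude = float(residual[position])
    sign_bit = 0 if amplitude >= 0 else 1
    segment = residual[max(0, position - 1):position + 2].astype(float)
    segment = np.pad(segment, (0, 3 - len(segment)))
    norm = segment / (np.max(np.abs(segment)) + 1e-12)
    shape_idx = int(np.argmin(np.sum((PULSE_SHAPES - np.abs(norm)) ** 2, axis=1)))
    amp_idx = int(np.clip(round(abs(amplitude) / 256.0), 0, 15))  # toy 4-bit quantizer
    return shape_idx, sign_bit, amp_idx
```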
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
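
Claim 23's pulse-position element can likewise be sketched: take the maximum-amplitude sample inside the first pitch period as the first glottal pulse and quantize its position. The coarser quantization grid for longer pitch periods shown below is an assumption made for illustration.

```python
import numpy as np

def first_glottal_pulse_position(residual: np.ndarray, pitch_period: int) -> int:
    """Position of the maximum-amplitude sample within the first pitch period."""
    return int(np.argmax(np.abs(residual[:pitch_period])))

def quantize_position(position: int, pitch_period: int) -> int:
    """Quantize the pulse position (assumed precision rule, for illustration)."""
    step = 1 if pitch_period < 64 else 2 if pitch_period < 128 else 4
    return (position // step) * step
```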
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
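
The energy-information element of claim 24 reads directly onto a simple computation: maximum signal energy for voiced or onset frames, average energy per sample otherwise. The sketch below follows the claim wording; treating the maximum as the peak squared sample is my interpretation for illustration.

```python
import numpy as np

def energy_information(frame: np.ndarray, frame_class: str) -> float:
    """Energy information parameter per the classification of the frame."""
    if frame_class in ("VOICED", "ONSET"):
        return float(np.max(frame * frame))   # maximum of the signal energy
    return float(np.mean(frame * frame))      # average energy per sample
```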
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (pitch p) = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6330533B2
CLAIM 1
. A speech encoding system for encoding a speech signal , the system comprising : an encoder processing circuit for adaptively selecting a first encoding scheme or a second encoding scheme ;
an adaptive codebook (sound signal, speech signal) containing excitation vectors representative of at least a portion of the speech signal consistent with the first encoding scheme ;
and a pitch preprocessing module (E q) associated with the first encoding scheme and applying warping to the speech signal by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified from the original time region to the modified time region to conform to target interpolated pitch values prior to selecting a preferential one of the excitation vectors of the adaptive codebook for the interval .

US6330533B2
CLAIM 12
. A speech encoder (last frame, replacement frame) using an analysis-by-synthesis approach on a speech signal having varying characteristics , the speech encoder comprising : an encoder processing circuit that adaptively selects a first long term prediction mode or a second long term prediction mode based , at least in part , on a selected bit rate for the speech signal ;
the first long term prediction mode comprising pitch preprocessing that employs warping by deforming a weighted speech signal , derived from the speech signal , from an original time region to a modified time region ;
and an adaptive codebook coupled to the encoder processing circuit , where pursuant to the deforming , at least an interval of the weighted speech signal of the original time region is temporally modified in the modified time region to conform to target interpolated pitch values prior to selecting a contribution of the adaptive codebook for the interval .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6104992A

Filed: 1998-09-18     Issued: 2000-08-15

Adaptive gain reduction to produce fixed codebook target signal

(Original Assignee) Lakestar Semi Inc     (Current Assignee) Hanger Solutions LLC

Yang Gao, Huan-Yu Su
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (filtered signal) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
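
The artificial periodic excitation of claim 1 can be illustrated as follows: one low-pass filter impulse response centred on the quantized first-glottal-pulse position, and further copies spaced by the average pitch value up to the end of the affected region. The filter design below (an 11-tap FIR low-pass from scipy) is a placeholder, not the codec's filter.

```python
import numpy as np
from scipy.signal import firwin

def build_periodic_excitation(frame_len: int, first_pulse_pos: int,
                              avg_pitch: int, lp_taps: int = 11) -> np.ndarray:
    """Low-pass filtered periodic pulse train, per the claim's construction."""
    h = firwin(lp_taps, cutoff=0.5)          # assumed low-pass impulse response
    half = lp_taps // 2
    excitation = np.zeros(frame_len)
    pos = first_pulse_pos
    while pos < frame_len:
        lo, hi = max(0, pos - half), min(frame_len, pos + half + 1)
        excitation[lo:hi] += h[(lo - pos) + half:(hi - pos) + half]
        pos += avg_pitch                      # next pulse one average pitch later
    return excitation
```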
US6104992A
CLAIM 6
. The speech system of claim 4 wherein the processing circuit calculates the correlation value based , at least in part , on a filtered signal (first impulse) from the adaptive codebook .

US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .
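
For comparison purposes, the two-stage procedure recited in US6104992A claim 13 can be sketched as below: minimize the residual with the adaptive-codebook contribution, then apply a gain reduction to that contribution and refit the fixed-codebook gain against the new residual. This is a toy editorial illustration, not the reference's implementation.

```python
import numpy as np

def recompute_fixed_contribution(target: np.ndarray, v_adaptive: np.ndarray,
                                 c_fixed: np.ndarray, g_pitch: float,
                                 gain_reduction: float = 0.9):
    """Return reduced pitch gain, refitted code gain, and both residuals."""
    # Stage 1: residual after the unreduced adaptive-codebook contribution.
    first_residual = target - g_pitch * v_adaptive
    # Stage 2: reduce the adaptive gain, then refit the fixed-codebook gain by
    # minimizing the energy of the second residual.
    g_pitch_reduced = gain_reduction * g_pitch
    second_target = target - g_pitch_reduced * v_adaptive
    g_code = float(np.dot(second_target, c_fixed) /
                   (np.dot(c_fixed, c_fixed) + 1e-12))
    second_residual = second_target - g_code * c_fixed
    return g_pitch_reduced, g_code, first_residual, second_residual
```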

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook (average energy) ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
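
The two-part energy control of claim 5 can be restated as a short sketch: scale the start of the first good frame so its energy matches the end of the last concealed frame, then ramp toward the energy implied by the received energy parameter while capping any increase. The linear per-sample ramp, the 16-sample energy windows and the cap value are assumptions for illustration.

```python
import numpy as np

def energy_control(synth: np.ndarray, E_concealed_end: float,
                   E_received: float, max_gain: float = 2.0) -> np.ndarray:
    """Scale the synthesized frame per claim 5's energy-control element."""
    eps = 1e-12
    E_begin = float(np.mean(synth[:16] ** 2)) + eps   # energy near frame start
    E_end = float(np.mean(synth[-16:] ** 2)) + eps    # energy near frame end
    g0 = min(np.sqrt(E_concealed_end / E_begin), max_gain)  # match concealed energy
    g1 = min(np.sqrt(E_received / E_end), max_gain)         # converge to received energy
    gains = np.linspace(g0, g1, num=len(synth))              # sample-by-sample ramp
    return synth * gains
```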
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (filtered signal) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6104992A
CLAIM 6
. The speech system of claim 4 wherein the processing circuit calculates the correlation value based , at least in part , on a filtered signal (first impulse) from the adaptive codebook .

US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook (average energy) ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) , limits to a given value a gain used for scaling the synthesized sound signal .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6104992A
CLAIM 1
. A speech system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech system comprising : an adaptive codebook ;
a fixed codebook (average energy) ;
a processing circuit that sequentially identifies a first gain applied to the adaptive codebook and a second gain applied to the fixed codebook ;
and the processing circuit identifies a gain reduction factor applied to the first gain identified , the gain reduction factor is used by the processing circuit to perform the identification of the second gain .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (second residual signal, first residual signal) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6104992A
CLAIM 13
. A speech system using an analysis by synthesis approach on a speech signal , the speech system comprising : an adaptive codebook ;
a fixed codebook ;
a processing circuit that attempts to minimize a first residual signal (decoder concealment, decoder recovery) using contributions from both the adaptive codebook and the fixed codebook ;
and the processing circuit , after attempting to minimize the first residual signal , applying gain reduction to the contribution from the adaptive codebook and then recalculating the contribution from the fixed codebook by attempting to minimize a second residual signal (decoder concealment, decoder recovery) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6188980B1

Filed: 1998-09-18     Issued: 2001-02-13

Synchronized encoder-decoder frame concealment using speech coding parameters including line spectral frequencies and filter coefficients

(Original Assignee) Lakestar Semi Inc     (Current Assignee) Samsung Electronics Co Ltd

Jes Thyssen
US7693710B2
CLAIM 1
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
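A minimal sketch of this limitation, assuming access to the LP residual and the pitch period of the frame; the quantization step is a placeholder value, not taken from the patent.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, quant_step=4):
    """Sketch: the sample of maximum amplitude within the pitch period is taken as
    the first glottal pulse, and its position is quantized (quant_step is assumed)."""
    position = int(np.argmax(np.abs(residual[:pitch_period])))
    quantized_position = (position // quant_step) * quant_step
    return position, quantized_position
```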
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
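A hedged sketch of the energy information parameter as recited here, assuming the parameter is computed on one frame of the signal; the log-domain representation at the end is an assumption, not part of the claim language.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Sketch: energy information parameter per this limitation.

    frame       -- one frame of the (speech) signal as a numpy array
    frame_class -- one of 'unvoiced', 'unvoiced transition',
                   'voiced transition', 'voiced', 'onset'
    """
    if frame_class in ('voiced', 'onset'):
        # Relate the parameter to a maximum of the signal energy.
        energy = np.max(frame ** 2)
    else:
        # Relate the parameter to the average energy per sample.
        energy = np.mean(frame ** 2)
    # A log-domain value is one plausible representation for quantization (assumption).
    return 10.0 * np.log10(energy + 1e-12)
```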
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .
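To make the mapped reference easier to follow, a minimal sketch of the threshold-driven selection among correction techniques recited in US6188980B1 claims 1, 4 and 16; the numeric thresholds, their ordering, and the behavior of the first technique are assumptions, since the claims leave them unspecified.

```python
def count_out_of_order(lsfs):
    """Number of line spectral frequencies that break the expected ascending order."""
    return sum(1 for a, b in zip(lsfs, lsfs[1:]) if b <= a)

def correct_lsfs(lsfs, prev_lsfs, t1=1, t2=2, t3=4):
    """Sketch: select a correction technique from the count of out-of-order LSFs.

    t1, t2, t3 are placeholder thresholds and the ordering t1 < t2 < t3 is assumed.
    Returns (corrected_lsfs, erase_frame_flag).
    """
    n = count_out_of_order(lsfs)
    if n > t3:
        # Third technique (claim 4): treat the frame as erased.
        return list(prev_lsfs), True
    if n > t2:
        # Second technique (claim 1): reorder the out-of-order series.
        return sorted(lsfs), False
    if n > t1:
        # Claim 1 does not spell out the first technique; claim 16 suggests replacing
        # values with the previous frame's series, which is assumed here.
        return list(prev_lsfs), False
    return list(lsfs), False
```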

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
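A minimal sketch of the energy control recited in this limitation, assuming per-sample energies measured over the first and last quarter of the frame and a placeholder cap on any gain increase; it is not the patent's implementation.

```python
import numpy as np

def control_recovery_energy(synth, e_concealed_end, e_received, max_up_gain=2.0):
    """Sketch: energy control in the first non erased frame after an erasure.

    synth           -- synthesized sound signal of that frame (1-D array)
    e_concealed_end -- per-sample energy at the end of the last concealed (erased) frame
    e_received      -- energy corresponding to the received energy information parameter
    max_up_gain     -- cap on any gain increase (placeholder value, not from the claim)
    """
    n = len(synth)
    q = max(1, n // 4)
    e_begin = float(np.mean(synth[:q] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-q:] ** 2)) + 1e-12
    # Gain at the frame beginning matches the energy at the end of the erased frame.
    g0 = min(np.sqrt(e_concealed_end / e_begin), max_up_gain)
    # Gain at the frame end converges to the transmitted energy, with the increase limited.
    g1 = min(np.sqrt(e_received / e_end), max_up_gain)
    gains = g0 + (g1 - g0) * np.arange(n) / max(1, n - 1)  # sample-by-sample interpolation
    return synth * gains
```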
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US6188980B1
CLAIM 16
. A method used by a speech encoder (last frame, replacement frame) that operates on a speech signal , the method comprising : producing from the speech signal a series of line spectral frequencies , at least one of the line spectral frequencies being produced out of order ;
determining the number of line spectral frequencies that are produced out of order ;
and deciding either to reorder the series of line spectral frequencies that are produced out of order , or to replace at least a portion of the series of line spectral frequencies that are produced out of order using at least a portion of one previous series of line spectral frequencies if the number of line spectral frequencies that is produced out of order exceeds a predetermined threshold .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (frame erasure) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
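A short sketch of the two transition cases in which the frame-beginning gain is made equal to the frame-end gain, with class labels assumed to match the classification recited above.

```python
def frame_beginning_gain(g_begin, g_end, last_class, first_class,
                         last_was_comfort_noise, first_is_active_speech):
    """Sketch: cases in which the gain used at the beginning of the first non erased
    frame is forced equal to the gain used at its end (class names are assumed)."""
    voiced_like = ("voiced transition", "voiced", "onset")
    voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
    noise_to_speech = last_was_comfort_noise and first_is_active_speech
    return g_end if (voiced_to_unvoiced or noise_to_speech) else g_begin
```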
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US6188980B1
CLAIM 16
. A method used by a speech encoder (last frame, replacement frame) that operates on a speech signal , the method comprising : producing from the speech signal a series of line spectral frequencies , at least one of the line spectral frequencies being produced out of order ;
determining the number of line spectral frequencies that are produced out of order ;
and deciding either to reorder the series of line spectral frequencies that are produced out of order , or to replace at least a portion of the series of line spectral frequencies that are produced out of order using at least a portion of one previous series of line spectral frequencies if the number of line spectral frequencies that is produced out of order exceeds a predetermined threshold .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
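Reading the relation as an energy-domain ratio, the following is a minimal sketch of the adjustment recited in claims 8 and 9, where E_LP0 and E_LP1 are computed as energies of truncated impulse responses of the LP synthesis filters; the truncation length and the example coefficients are placeholders.

```python
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the first n samples of the impulse response of 1 / A(z),
    where A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p (so a[0] == 1)."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_concealed, a_received):
    """E_q = E_1 * (E_LP0 / E_LP1): e1 is the energy at the end of the current frame,
    a_concealed holds the LP coefficients of the last concealed frame, and a_received
    those of the first non erased frame received after the erasure."""
    e_lp0 = lp_impulse_response_energy(a_concealed)
    e_lp1 = lp_impulse_response_energy(a_received)
    return e1 * e_lp0 / e_lp1

# Example: the received frame's filter has the higher gain, so the excitation energy is lowered.
e_q = adjusted_excitation_energy(1.0, [1.0, -0.5], [1.0, -0.8])
```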
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US6188980B1
CLAIM 16
. A method used by a speech encoder (last frame, replacement frame) that operates on a speech signal , the method comprising : producing from the speech signal a series of line spectral frequencies , at least one of the line spectral frequencies being produced out of order ;
determining the number of line spectral frequencies that are produced out of order ;
and deciding either to reorder the series of line spectral frequencies that are produced out of order , or to replace at least a portion of the series of line spectral frequencies that are produced out of order using at least a portion of one previous series of line spectral frequencies if the number of line spectral frequencies that is produced out of order exceeds a predetermined threshold .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US6188980B1
CLAIM 16
. A method used by a speech encoder (last frame, replacement frame) that operates on a speech signal , the method comprising : producing from the speech signal a series of line spectral frequencies , at least one of the line spectral frequencies being produced out of order ;
determining the number of line spectral frequencies that are produced out of order ;
and deciding either to reorder the series of line spectral frequencies that are produced out of order , or to replace at least a portion of the series of line spectral frequencies that are produced out of order using at least a portion of one previous series of line spectral frequencies if the number of line spectral frequencies that is produced out of order exceeds a predetermined threshold .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (frame erasure) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US6188980B1
CLAIM 16
. A method used by a speech encoder (last frame, replacement frame) that operates on a speech signal , the method comprising : producing from the speech signal a series of line spectral frequencies , at least one of the line spectral frequencies being produced out of order ;
determining the number of line spectral frequencies that are produced out of order ;
and deciding either to reorder the series of line spectral frequencies that are produced out of order , or to replace at least a portion of the series of line spectral frequencies that are produced out of order using at least a portion of one previous series of line spectral frequencies if the number of line spectral frequencies that is produced out of order exceeds a predetermined threshold .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6188980B1
CLAIM 1
. A speech encoding system using an analysis by synthesis approach on a speech signal (speech signal, decoder determines concealment) , the speech encoding system comprising : an encoder that generates a series of line spectral frequencies , some of the line spectral frequencies are produced out of order ;
the encoder determines the number of line spectral frequencies that are produced out of order ;
the encoder selectively applies a first of a plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a first predetermined threshold ;
the encoder selectively applies a second of the plurality of correction techniques to process the series of line spectral frequencies that are produced out of order if the number of line spectral frequencies that is produced out of order exceeds a second predetermined threshold ;
and the second of the plurality of correction techniques comprises reordering the series of line spectral frequencies that are produced out of order .

US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6188980B1
CLAIM 4
. The speech encoding system of claim 1 wherein a third of the plurality of correction techniques comprises frame erasure (frame erasure) .

US6188980B1
CLAIM 16
. A method used by a speech encoder (last frame, replacement frame) that operates on a speech signal , the method comprising : producing from the speech signal a series of line spectral frequencies , at least one of the line spectral frequencies being produced out of order ;
determining the number of line spectral frequencies that are produced out of order ;
and deciding either to reorder the series of line spectral frequencies that are produced out of order , or to replace at least a portion of the series of line spectral frequencies that are produced out of order using at least a portion of one previous series of line spectral frequencies if the number of line spectral frequencies that is produced out of order exceeds a predetermined threshold .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6122608A

Filed: 1998-08-15     Issued: 2000-09-19

Method for switched-predictive quantization

(Original Assignee) Texas Instruments Inc     (Current Assignee) Texas Instruments Inc

Alan V. McCree
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (weighting function) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (impulse response) up to the end of a last subframe (weighting value) affected by the artificial construction of the periodic part .
US6122608A
CLAIM 5
. The method of claim 4 wherein said squared error is multiplied by a weighting value (last subframe) for each dimension .

US6122608A
CLAIM 6
. The method of claim 5 wherein the weighting function (first impulse, first impulse response) is a Euclidean distance for LSF quantization .

US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .
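As a partial, non-authoritative sketch of the first steps recited in US6122608A claim 21: the LPC synthesis response to an impulse, a perceptual weighting stage (A(z/gamma) is assumed here as one common choice), and the autocorrelation of the weighted impulse response. The Jacobian and LSF-weight steps are not shown.

```python
import numpy as np

def weighted_impulse_autocorrelation(lpc_a, gamma=0.8, n=64, lags=11):
    """Partial sketch of claim 21's first steps; gamma, n and lags are assumed values.

    lpc_a -- LPC coefficients of A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p (lpc_a[0] == 1)
    """
    # Apply an impulse to the LPC synthesis filter 1/A(z) and run n samples of the response.
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(lpc_a)):
            if i - k >= 0:
                acc -= lpc_a[k] * h[i - k]
        h[i] = acc
    # Filter the samples with a perceptual filter; A(z/gamma) is assumed here.
    w = np.zeros(n)
    for i in range(n):
        acc = h[i]
        for k in range(1, len(lpc_a)):
            if i - k >= 0:
                acc += (gamma ** k) * lpc_a[k] * h[i - k]
        w[i] = acc
    # Autocorrelation of the weighted impulse response.
    return np.array([np.dot(w[: n - lag], w[lag:]) for lag in range(lags)])
```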

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (second target) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6122608A
CLAIM 1
. A switched predictive method of quantizing an input signal comprising the steps of : generating a set of parameters associated with said input signal ;
providing a first mean value and subtracting said first mean value from said set of parameters to get first mean-removed input ;
providing a second mean value and subtracting said second mean value from said set of parameters to get second mean-removed input ;
providing a quantizer with a first set of codebooks and second set of codebooks ;
providing a first prediction matrix and a second prediction matrix ;
multiplying a previous frame mean-removed quantized value to said first prediction matrix then said second prediction matrix to get first predicted value and then second predicted value ;
subtracting said first predicted value from said first mean-removed input to get first target value and subtracting said second predicted value from said second mean-removed input to get second target (controlling energy) value ;
applying said first target value to said first set of codebooks to get first quantized target value and applying said second target value to said second set of codebooks to get second quantized target value ;
adding said first predicted value to said first quantized target value to get first mean-removed quantized value and adding said second predicted value to said second quantized target value to get second mean-removed quantized value ;
adding said first mean value to said first mean-removed quantized value to get first quantized value and adding said second mean value to said second mean-removed quantized value to get second quantized value ;
and determining which set of codebooks and prediction matrix has minimum error and selectively providing an output signal representing the quantized value corresponding to that codebook set with minimum error .
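A compact sketch of the two-branch switched-predictive quantizer recited in US6122608A claim 1, with assumed shapes for the means, prediction matrices and codebooks; the branch error is measured as a squared Euclidean distance, consistent with the squared error mentioned in claim 5 of the reference.

```python
import numpy as np

def switched_predictive_quantize(x, prev_q, means, matrices, codebooks):
    """Sketch: two-branch switched-predictive quantization (minimum-error branch wins).

    x         -- current parameter vector (e.g. LSFs), shape (d,)
    prev_q    -- previous frame's mean-removed quantized vector, shape (d,)
    means     -- [mean_1, mean_2], each shape (d,)
    matrices  -- [P_1, P_2] prediction matrices, each shape (d, d)
    codebooks -- [C_1, C_2], each shape (K_i, d) of candidate target vectors
    Returns (branch_index, quantized_vector, new_mean_removed_quantized_vector).
    """
    best = None
    for i in range(2):
        predicted = matrices[i] @ prev_q                  # predicted value for this branch
        target = (x - means[i]) - predicted               # mean-removed input minus prediction
        errs = np.sum((codebooks[i] - target) ** 2, axis=1)
        j = int(np.argmin(errs))                          # nearest codebook entry to the target
        mean_removed_q = predicted + codebooks[i][j]      # predicted value + quantized target
        quantized = means[i] + mean_removed_q             # add the mean back
        err = float(np.sum((quantized - x) ** 2))         # branch error (squared Euclidean)
        if best is None or err < best[0]:
            best = (err, i, quantized, mean_removed_q)
    return best[1], best[2], best[3]
```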

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (weighting function) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (impulse response) up to an end of a last subframe (weighting value) affected by the artificial construction of the periodic part .
US6122608A
CLAIM 5
. The method of claim 4 wherein said squared error is multiplied by a weighting value (last subframe) for each dimension .

US6122608A
CLAIM 6
. The method of claim 5 wherein the weighting function (first impulse, first impulse response) is a Euclidean distance for LSF quantization .

US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .
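Claims 9, 12, 21 and 25 charted in this section all recite the same energy-adjustment relation, E_q = E_1 · (E_LP0 / E_LP1), where E_LP0 and E_LP1 are energies of LP-filter impulse responses. A minimal sketch of that computation, assuming illustrative LP coefficients and a 64-sample impulse-response window (both assumptions):

import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a_coeffs, n=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z)."""
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h * h))

def adjusted_energy(e1, a_last_good, a_first_good):
    e_lp0 = lp_impulse_energy(a_last_good)    # LP filter of last non-erased frame before erasure
    e_lp1 = lp_impulse_energy(a_first_good)   # LP filter of first non-erased frame after erasure
    return e1 * e_lp0 / e_lp1                 # E_q = E_1 * (E_LP0 / E_LP1)

# Example with illustrative first-order LP filters A(z) = 1 + a1*z^-1.
e_q = adjusted_energy(e1=1.0e4, a_last_good=[1.0, -0.9], a_first_good=[1.0, -0.5])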

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6122608A
CLAIM 21
. The method of claim 19 wherein said weighting value is determined by the steps of applying an impulse to said LPC filter and running N samples of the LPC synthesis response ;
filtering the samples with a perceptual filter ;
calculating autocorrelation function of weighted impulse response (impulse responses, impulse response, LP filter) ;
computing Jacobian matrix for said LSFs ;
computing correlation of rows of Jacobian matrix ;
and calculating LSF weights by multiplying correlation matrices .
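For comparison, US6122608A claim 21 (quoted above) derives LSF weights from a perceptually weighted LPC impulse response. The partial sketch below covers only the first steps recited in the claim (impulse through the LPC synthesis filter, perceptual weighting, autocorrelation); the Jacobian and correlation-matrix steps are omitted, and the filter coefficients and gamma value are assumptions:

import numpy as np
from scipy.signal import lfilter

def weighted_impulse_autocorr(a_coeffs, gamma=0.8, n=40, lags=11):
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)          # N samples of the LPC synthesis response
    a_weighted = [c * gamma ** i for i, c in enumerate(a_coeffs)]
    hw = lfilter(a_coeffs, a_weighted, h)          # simple perceptual filter A(z)/A(z/gamma)
    return np.array([np.dot(hw[:n - k], hw[k:]) for k in range(lags)])

r = weighted_impulse_autocorr([1.0, -1.2, 0.5])    # autocorrelation of the weighted response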




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JP2000059231A

Filed: 1998-08-10     Issued: 2000-02-25

圧縮音声エラー補償方法およびデータストリーム再生装置

(Original Assignee) Hitachi Ltd; 株式会社日立製作所     

Yukio Fujii, Shinichi Obata, 信一 小畑, 藤井  由紀夫
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラーフラグ) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (上記逆) value from the preceding impulse response up to the end of a last subframe (の誤り検出) affected by the artificial construction of the periodic part .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出 (last subframe) ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。
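JP2000059231A claim 1 (quoted above) generates compensation data for an erroneous frame fn from the last block of frame fn-1 and the first block of frame fn+1. The claim does not fix a particular combination rule; the sketch below assumes simple linear cross-fading between the two anchor blocks, purely for illustration:

import numpy as np

def conceal_frame(prev_frame_blocks, next_frame_blocks, n_blocks):
    """prev/next_frame_blocks: per-block sample arrays of frames fn-1 and fn+1."""
    anchor_a = np.asarray(prev_frame_blocks[-1], dtype=float)   # block (bn_end - 1) of frame fn-1
    anchor_b = np.asarray(next_frame_blocks[0], dtype=float)    # block 0 of frame fn+1
    out = []
    for b in range(n_blocks):
        w = (b + 1) / (n_blocks + 1)             # fade from anchor_a toward anchor_b
        out.append((1.0 - w) * anchor_a + w * anchor_b)
    return out

blocks = conceal_frame([np.ones(32)] * 12, [np.zeros(32)] * 12, n_blocks=12)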

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラーフラグ) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (の音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。
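Claim 4 charted above computes the energy information parameter from a maximum of the signal energy for voiced or onset frames and from the average energy per sample otherwise. A minimal sketch, assuming non-overlapping pitch-length windows and a dB representation (both assumptions):

import numpy as np

def energy_info(frame, frame_class, pitch=64):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        # maximum short-term energy over pitch-length windows inside the frame
        e = max(np.sum(x[i:i + pitch] ** 2) / pitch
                for i in range(0, len(x) - pitch + 1, pitch))
    else:
        e = np.sum(x ** 2) / len(x)              # average energy per sample
    return 10.0 * np.log10(e + 1e-12)            # expressed in dB (assumption)

print(energy_info(np.random.randn(256), "voiced"))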

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (の音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (エラーフラグ) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。
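Claims 6 and 18 limit the scaling gain to a given value when the first non-erased frame after an erasure is classified as onset. A minimal sketch; the cap value g_max_onset is an assumption:

def scaling_gain(g_computed, first_good_frame_class, g_max_onset=1.2):
    """Cap the synthesis scaling gain when recovery starts on an onset frame."""
    if first_good_frame_class == "onset":
        return min(g_computed, g_max_onset)      # limit to a given value
    return g_computed

assert scaling_gain(3.0, "onset") == 1.2
assert scaling_gain(3.0, "voiced") == 3.0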

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (の音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。
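Claims 7 and 19 set the gain at the beginning of the first non-erased frame equal to the gain at its end in two cases: a voiced-to-unvoiced transition, and a recovery from comfort noise into active speech. A minimal sketch with illustrative class labels:

def begin_gain(g_begin, g_end, last_good_class, first_good_class):
    voiced_like = {"voiced transition", "voiced", "onset"}
    if last_good_class in voiced_like and first_good_class == "unvoiced":
        return g_end                             # voiced -> unvoiced transition
    if last_good_class == "comfort noise" and first_good_class == "active speech":
        return g_end                             # inactive -> active speech transition
    return g_begin

print(begin_gain(0.4, 0.9, "voiced", "unvoiced"))    # -> 0.9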

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (エラーフラグ) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (上記逆) value from the preceding impulse response up to an end of a last subframe (の誤り検出) affected by the artificial construction of the periodic part .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出 (last subframe) ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (エラーフラグ) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (エラーフラグ) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JP2000059231A
CLAIM 1
【請求項1】圧縮の基本単位である圧縮音声フレームの先頭を示す同期ワードと、上記圧縮音声フレームがエラーを含んでいるかどうかを検出するための誤り検出ワードと、更なる細かいデータ区分である複数のオーディオデータブロックとで構成される上記圧縮音声フレーム、或いは上記圧縮音声フレームが複数連続した圧縮音声ストリームを入力し、上記圧縮音声フレーム内に含まれる同期ワードによって上記圧縮音声フレームの先頭を判別し、上記1フレーム分が判別された圧縮音声フレームについて上記圧縮音声フレーム内に含まれる上記誤り検出ワードにより、当該圧縮音声がフレームがエラーを含んでいるかどうかを判別し、上記圧縮オーディオフレーム内に含まれる上記データブロックの中に含まれる振幅値と正規化サンプル値を使ってサンプル値を逆量子化して、帯域合成処理をした結果の音声信号 (speech signal) を出力する圧縮音声再生方法において、フレームエラーの履歴を保持し、(fn) 番目圧縮音声フレームにエラーが検出されている場合に (fn は整数)、(fn -1) 番目圧縮音声フレームの (bn end - 1) 番目ブロック (bn end は1フレーム内の総ブロック数) のデータと、(fn + 1) 番目圧縮音声フレームの (0) 番目ブロックのデータから (fn) 番目圧縮音声フレームの (0) 番目ブロック〜 (bn end - 1) 番目ブロックのデータに対するエラー補償データを生成することを特徴とした圧縮音声エラー補償方法。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH1198090A

Filed: 1998-07-24     Issued: 1999-04-09

音声符号化/復号化装置

(Original Assignee) Nec Corp; 日本電気株式会社     

Kiyoko Tanaka, 聖子 田中
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (per) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH1198090A
CLAIM 1
【請求項1】 入力音 (first impulse) 声信号に対して有声・無声区間を判別して有声・無声に応じた識別制御信号を出力する有声・無声判別部と、前記識別制御信号により有声のときに入力される前記入力音声信号に対して線形予測分析法に基づいて合成フィルタのフィルタ係数を算出することでLPC(Linear Predictive Coding)パラメータを取得すると共に、該LPCパラメータをLSP(Line Spectrum Pair)パラメータに換算するLPC分析部と、前記識別制御信号により有声から無声に切り替わったときに直前の有声時における前記LPCパラメータを一時蓄積するLPC蓄積部と、前記LPCパラメータに基づいて無声のときの雑音特性を有声のときの雑音特性に近付けて前記線形予測分析法に供する背景雑音を生成するための濾波を行うLPFと、前記LSPパラメータに基づいて符号化処理を行って符号化音声信号又は雑音信号を出力する高能率符号化処理部と、前記識別制御信号に応じて有声のときに前記符号化音声信号,無声ときに前記雑音信号をそれぞれ出力符号化信号として切り替え送出するスイッチ制御部とを備えたことを特徴とする音声符号化装置。

JPH1198090A
CLAIM 7
【請求項7】 請求項1〜6の何れか一つに記載の音声符号化装置において、前記LPFは、送信者による発声を示す有声又は無発声を示す無声を識別するVOX(Voice Oper (average pitch) ated Transmitter)機能を有すると共に、CELP(Code−book Excited Linear Prediction)方式に基づいて無声に伴う雑音のスペクトル特性を音声に伴う雑音のスペクトル特性に近似して前記背景雑音を生成出力することを特徴とする音声符号化装置。
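JPH1198090A claim 1 (quoted above) combines a voiced/unvoiced decision with LPC analysis of voiced input. The sketch below is a generic stand-in, not the reference's implementation: the energy and zero-crossing thresholds, the LPC order and the autocorrelation method are assumptions, and the LSP conversion is omitted:

import numpy as np
from scipy.linalg import solve_toeplitz

def is_voiced(frame, energy_thr=0.01, zcr_thr=0.25):
    x = np.asarray(frame, dtype=float)
    energy = np.mean(x ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2.0
    return energy > energy_thr and zcr < zcr_thr

def lpc_coeffs(frame, order=10):
    x = np.asarray(frame, dtype=float)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])   # autocorrelation method
    return np.concatenate(([1.0], -a))                           # A(z) = 1 - sum a_k z^-k

frame = np.sin(2 * np.pi * 0.05 * np.arange(240)) + 0.01 * np.random.randn(240)
if is_voiced(frame):
    a = lpc_coeffs(frame)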

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (の雑音) and the first non erased frame received after frame erasure is encoded as active speech .
JPH1198090A
CLAIM 1
【請求項1】 入力音声信号に対して有声・無声区間を判別して有声・無声に応じた識別制御信号を出力する有声・無声判別部と、前記識別制御信号により有声のときに入力される前記入力音声信号に対して線形予測分析法に基づいて合成フィルタのフィルタ係数を算出することでLPC(Linear Predictive Coding)パラメータを取得すると共に、該LPCパラメータをLSP(Line Spectrum Pair)パラメータに換算するLPC分析部と、前記識別制御信号により有声から無声に切り替わったときに直前の有声時における前記LPCパラメータを一時蓄積するLPC蓄積部と、前記LPCパラメータに基づいて無声のときの雑音 (comfort noise) 特性を有声のときの雑音特性に近付けて前記線形予測分析法に供する背景雑音を生成するための濾波を行うLPFと、前記LSPパラメータに基づいて符号化処理を行って符号化音声信号又は雑音信号を出力する高能率符号化処理部と、前記識別制御信号に応じて有声のときに前記符号化音声信号,無声ときに前記雑音信号をそれぞれ出力符号化信号として切り替え送出するスイッチ制御部とを備えたことを特徴とする音声符号化装置。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (per) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH1198090A
CLAIM 1
【請求項1】 入力音 (first impulse) 声信号に対して有声・無声区間を判別して有声・無声に応じた識別制御信号を出力する有声・無声判別部と、前記識別制御信号により有声のときに入力される前記入力音声信号に対して線形予測分析法に基づいて合成フィルタのフィルタ係数を算出することでLPC(Linear Predictive Coding)パラメータを取得すると共に、該LPCパラメータをLSP(Line Spectrum Pair)パラメータに換算するLPC分析部と、前記識別制御信号により有声から無声に切り替わったときに直前の有声時における前記LPCパラメータを一時蓄積するLPC蓄積部と、前記LPCパラメータに基づいて無声のときの雑音特性を有声のときの雑音特性に近付けて前記線形予測分析法に供する背景雑音を生成するための濾波を行うLPFと、前記LSPパラメータに基づいて符号化処理を行って符号化音声信号又は雑音信号を出力する高能率符号化処理部と、前記識別制御信号に応じて有声のときに前記符号化音声信号,無声ときに前記雑音信号をそれぞれ出力符号化信号として切り替え送出するスイッチ制御部とを備えたことを特徴とする音声符号化装置。

JPH1198090A
CLAIM 7
【請求項7】 請求項1〜6の何れか一つに記載の音声符号化装置において、前記LPFは、送信者による発声を示す有声又は無発声を示す無声を識別するVOX(Voice Oper (average pitch) ated Transmitter)機能を有すると共に、CELP(Code−book Excited Linear Prediction)方式に基づいて無声に伴う雑音のスペクトル特性を音声に伴う雑音のスペクトル特性に近似して前記背景雑音を生成出力することを特徴とする音声符号化装置。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (の雑音) and the first non erased frame received after frame erasure is encoded as active speech .
JPH1198090A
CLAIM 1
【請求項1】 入力音声信号に対して有声・無声区間を判別して有声・無声に応じた識別制御信号を出力する有声・無声判別部と、前記識別制御信号により有声のときに入力される前記入力音声信号に対して線形予測分析法に基づいて合成フィルタのフィルタ係数を算出することでLPC(Linear Predictive Coding)パラメータを取得すると共に、該LPCパラメータをLSP(Line Spectrum Pair)パラメータに換算するLPC分析部と、前記識別制御信号により有声から無声に切り替わったときに直前の有声時における前記LPCパラメータを一時蓄積するLPC蓄積部と、前記LPCパラメータに基づいて無声のときの雑音 (comfort noise) 特性を有声のときの雑音特性に近付けて前記線形予測分析法に供する背景雑音を生成するための濾波を行うLPFと、前記LSPパラメータに基づいて符号化処理を行って符号化音声信号又は雑音信号を出力する高能率符号化処理部と、前記識別制御信号に応じて有声のときに前記符号化音声信号,無声ときに前記雑音信号をそれぞれ出力符号化信号として切り替え送出するスイッチ制御部とを備えたことを特徴とする音声符号化装置。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6081776A

Filed: 1998-07-13     Issued: 2000-06-27

Speech coding system and method including adaptive finite impulse response filter

(Original Assignee) Lockheed Martin Corp     (Current Assignee) Lockheed Martin Corp

Mark Lewis Grabb, Steven Robert Koch, Glen William Brooksby, Richard Louis Zinser, Jr.
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .
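US6081776A claim 1 (quoted above) defines the adaptive smoother H(z) = (0.5 + Δ) + (0.5 - Δ)z^-1 with Δ = min(0.5, γ|ƒ(n) - ƒ(n-1)|), which in the time domain is y(n) = (0.5 + Δ)·ƒ(n) + (0.5 - Δ)·ƒ(n-1). A minimal sketch applying it to a line spectral frequency track; the γ value and the test data are assumptions:

import numpy as np

def adaptive_fir_smooth(lsf_track, gamma=4.0):
    track = np.asarray(lsf_track, dtype=float)
    out = np.empty_like(track)
    out[0] = track[0]                            # boundary sample passed through unchanged
    for n in range(1, len(track)):
        d = min(0.5, gamma * abs(track[n] - track[n - 1]))       # adaptive delta
        out[n] = (0.5 + d) * track[n] + (0.5 - d) * track[n - 1]
    return out

smoothed = adaptive_fir_smooth([0.30, 0.31, 0.29, 0.45, 0.46])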

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6081776A
CLAIM 1
. A spectral frequency quantizer comprising : a digital signal processor adapted to operate , at least in part , as a finite impulse response (impulse responses, impulse response, LP filter) filter , said finite impulse response filter comprising an input for receiving line spectral frequencies , and an output for providing smoothed quantized line spectral frequencies , said finite impulse response filter characterized by the transfer function : H(z)=0.5+Δ+(0.5-Δ)z^-1 wherein Δ is determined by the formula : Δ=min(0.5 , γ|ƒ(n)-ƒ(n-1)|) and wherein ƒ(n) denotes a present input to said finite impulse response filter , and wherein ƒ(n-1) denotes a previous input to said finite impulse response filter .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
CN1243615A

Filed: 1998-07-09     Issued: 2000-02-02

用于控制信号的振幅电平的方法和装置

(Original Assignee) 艾利森公司     

M·A·琼斯
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (该输出信号, 输入信) up to the end of a last subframe affected by the artificial construction of the periodic part .
CN1243615A
CLAIM 1
. 一种振幅控制电路,其中包括:接收输入信 (preceding impulse response) 号并根据该输入信号产生一输出信号的可变增益放大器;接收该输出信号 (preceding impulse response) 并确定相应均方信号的信号处理器;以及一分析器,其把该均方信号与一参考值相比较并产生连接到可变增益放大器的反馈控制信号,用于根据均方信号与参考值之间的差别控制可变增益放大器的增益,其特征在于,可变增益放大器的增益被控制使得输出信号的振幅保持在所需的振幅电平上。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (该输出信号, 输入信) up to an end of a last subframe affected by the artificial construction of the periodic part .
CN1243615A
CLAIM 1
. 一种振幅控制电路,其中包括:接收输入信 (preceding impulse response) 号并根据该输入信号产生一输出信号的可变增益放大器;接收该输出信号 (preceding impulse response) 并确定相应均方信号的信号处理器;以及一分析器,其把该均方信号与一参考值相比较并产生连接到可变增益放大器的反馈控制信号,用于根据均方信号与参考值之间的差别控制可变增益放大器的增益,其特征在于,可变增益放大器的增益被控制使得输出信号的振幅保持在所需的振幅电平上。
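CN1243615A (and its WO counterpart charted below) describes an amplitude control loop in which the mean square of the variable-gain amplifier output is compared with a reference to drive the gain. A minimal discrete-time sketch; the loop step size and smoothing constant are assumptions:

import numpy as np

def agc(x, ref_power=1.0, mu=0.05, alpha=0.99):
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    gain, ms = 1.0, 0.0
    for n, sample in enumerate(x):
        y[n] = gain * sample                             # variable gain amplifier
        ms = alpha * ms + (1.0 - alpha) * y[n] ** 2      # running mean-square of the output
        gain = max(gain + mu * (ref_power - ms), 0.0)    # feedback from (reference - mean square)
    return y

out = agc(0.1 * np.random.randn(2000))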




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
WO9913570A1

Filed: 1998-07-09     Issued: 1999-03-18

Method and apparatus for controlling signal amplitude level

(Original Assignee) Ericsson Inc.     

Mark A. Jones
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
WO9913570A1
CLAIM 1
. An amplitude control circuit , comprising : a variable gain amplifier receiving an input signal and generating an output signal corresponding to the input signal ;
a signal processor receiving the output signal and determining a corresponding mean squared signal ;
and an analyzer comparing the mean squared signal with a reference and generating a feedback control signal (maximum amplitude) connected to the variable gain amplifier for controlling the gain of the variable gain amplifier in accordance with a difference between the mean squared signal and the reference value , wherein the gain of the variable gain amplifier is controlled so that the amplitude of the output signal is maintained at a desired amplitude level .
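Claims 3 and 11 charted in this section determine the phase information parameter by taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position. A minimal sketch; the quantization step is an assumption:

import numpy as np

def quantized_first_pulse_position(residual, pitch_period, step=4):
    seg = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(seg)))            # sample of maximum amplitude in the pitch period
    return (pos // step) * step                  # quantized position relative to the frame start

res = np.zeros(160)
res[53] = 1.0
print(quantized_first_pulse_position(res, pitch_period=80))   # -> 52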

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
WO9913570A1
CLAIM 13
. The method in claim 12 , wherein the squaring and averaging steps c (LP filter) orrespond to generating a mean square of the output signal of the variable gain amplifier .

WO9913570A1
CLAIM 16
. An amplitude control circuit , comprising : a differential amplifier having a pair of differential signal input terminal (LP filter excitation signal) s connected to biasing terminals of a pair of connected transistors ;
a variable gain amplifier circuit receiving differential signals produced by the differential amplifier and generating a differential output signal at a pair of differential signal output terminals ;
a multiplier circuit for squaring the differential output signal ;
and a current mirror connected to a reference current and to an output of the multiplier circuit for generating a gain control signal connected to the variable gain amplifier .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (steps c) excitation signal (input terminal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9913570A1
CLAIM 13
. The method in claim 12 , wherein the squaring and averaging steps c (LP filter) orrespond to generating a mean square of the output signal of the variable gain amplifier .

WO9913570A1
CLAIM 16
. An amplitude control circuit , comprising : a differential amplifier having a pair of differential signal input terminal (LP filter excitation signal) s connected to biasing terminals of a pair of connected transistors ;
a variable gain amplifier circuit receiving differential signals produced by the differential amplifier and generating a differential output signal at a pair of differential signal output terminals ;
a multiplier circuit for squaring the differential output signal ;
and a current mirror connected to a reference current and to an output of the multiplier circuit for generating a gain control signal connected to the variable gain amplifier .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
WO9913570A1
CLAIM 1
. An amplitude control circuit , comprising : a variable gain amplifier receiving an input signal and generating an output signal corresponding to the input signal ;
a signal processor receiving the output signal and determining a corresponding mean squared signal ;
and an analyzer comparing the mean squared signal with a reference and generating a feedback control signal (maximum amplitude) connected to the variable gain amplifier for controlling the gain of the variable gain amplifier in accordance with a difference between the mean squared signal and the reference value , wherein the gain of the variable gain amplifier is controlled so that the amplitude of the output signal is maintained at a desired amplitude level .

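Note (illustrative only): the phase-information step of claim 11 above (the sample of maximum amplitude within a pitch period taken as the first glottal pulse, its position then quantized) could be sketched as follows. The uniform quantizer and 6-bit budget are assumptions, not taken from the patent or the cited reference.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, n_bits=6):
    """Measure the sample of maximum amplitude within one pitch period (taken as
    the first glottal pulse) and quantize its position, per claims 3 and 11."""
    segment = residual[:pitch_period]
    position = int(np.argmax(np.abs(segment)))      # sample of maximum amplitude
    step = max(1, pitch_period // (1 << n_bits))    # assumed uniform quantization step
    q_index = position // step                      # quantized position index to transmit
    q_position = q_index * step                     # reconstructed (quantized) position
    return q_index, q_position
```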
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9913570A1
CLAIM 13
. The method in claim 12 , wherein the squaring and averaging steps (LP filter) correspond to generating a mean square of the output signal of the variable gain amplifier .

WO9913570A1
CLAIM 16
. An amplitude control circuit , comprising : a differential amplifier having a pair of differential signal input terminals (LP filter excitation signal) connected to biasing terminals of a pair of connected transistors ;
a variable gain amplifier circuit receiving differential signals produced by the differential amplifier and generating a differential output signal at a pair of differential signal output terminals ;
a multiplier circuit for squaring the differential output signal ;
and a current mirror connected to a reference current and to an output of the multiplier circuit for generating a gain control signal connected to the variable gain amplifier .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
WO9913570A1
CLAIM 1
. An amplitude control circuit , comprising : a variable gain amplifier receiving an input signal and generating an output signal corresponding to the input signal ;
a signal processor receiving the output signal and determining a corresponding mean squared signal ;
and an analyzer comparing the mean squared signal with a reference and generating a feedback control signal (maximum amplitude) connected to the variable gain amplifier for controlling the gain of the variable gain amplifier in accordance with a difference between the mean squared signal and the reference value , wherein the gain of the variable gain amplifier is controlled so that the amplitude of the output signal is maintained at a desired amplitude level .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
WO9913570A1
CLAIM 13
. The method in claim 12 , wherein the squaring and averaging steps (LP filter) correspond to generating a mean square of the output signal of the variable gain amplifier .

WO9913570A1
CLAIM 16
. An amplitude control circuit , comprising : a differential amplifier having a pair of differential signal input terminals (LP filter excitation signal) connected to biasing terminals of a pair of connected transistors ;
a variable gain amplifier circuit receiving differential signals produced by the differential amplifier and generating a differential output signal at a pair of differential signal output terminals ;
a multiplier circuit for squaring the differential output signal ;
and a current mirror connected to a reference current and to an output of the multiplier circuit for generating a gain control signal connected to the variable gain amplifier .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (steps c) excitation signal (input terminal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9913570A1
CLAIM 13
. The method in claim 12 , wherein the squaring and averaging steps (LP filter) correspond to generating a mean square of the output signal of the variable gain amplifier .

WO9913570A1
CLAIM 16
. An amplitude control circuit , comprising : a differential amplifier having a pair of differential signal input terminals (LP filter excitation signal) connected to biasing terminals of a pair of connected transistors ;
a variable gain amplifier circuit receiving differential signals produced by the differential amplifier and generating a differential output signal at a pair of differential signal output terminals ;
a multiplier circuit for squaring the differential output signal ;
and a current mirror connected to a reference current and to an output of the multiplier circuit for generating a gain control signal connected to the variable gain amplifier .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
WO9913570A1
CLAIM 1
. An amplitude control circuit , comprising : a variable gain amplifier receiving an input signal and generating an output signal corresponding to the input signal ;
a signal processor receiving the output signal and determining a corresponding mean squared signal ;
and an analyzer comparing the mean squared signal with a reference and generating a feedback control signal (maximum amplitude) connected to the variable gain amplifier for controlling the gain of the variable gain amplifier in accordance with a difference between the mean squared signal and the reference value , wherein the gain of the variable gain amplifier is controlled so that the amplitude of the output signal is maintained at a desired amplitude level .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (steps c) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
WO9913570A1
CLAIM 13
. The method in claim 12 , wherein the squaring and averaging steps (LP filter) correspond to generating a mean square of the output signal of the variable gain amplifier .

WO9913570A1
CLAIM 16
. An amplitude control circuit , comprising : a differential amplifier having a pair of differential signal input terminals (LP filter excitation signal) connected to biasing terminals of a pair of connected transistors ;
a variable gain amplifier circuit receiving differential signals produced by the differential amplifier and generating a differential output signal at a pair of differential signal output terminals ;
a multiplier circuit for squaring the differential output signal ;
and a current mirror connected to a reference current and to an output of the multiplier circuit for generating a gain control signal connected to the variable gain amplifier .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6115687A

Filed: 1998-07-01     Issued: 2000-09-05

Sound reproducing speed converter

(Original Assignee) Panasonic Corp     (Current Assignee) III Holdings 12 LLC

Naoya Tanaka, Hiroaki Takeda
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (pitch period) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (inverse filter) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6115687A
CLAIM 4
. The apparatus for converting a voice reproducing rate according to claim 3 , wherein said apparatus comprising : linear predictive analysis means for calculating a linear predictive coefficients representing spectrum information of said input voice signal ;
inverse filter (impulse response) for calculating said prediction residual signal from said input voice signal using the calculated linear predictive coefficients ;
and synthesis filter for synthesizing a voice signal from a synthesis residual signal output from said waveform synthesis means using said linear predictive coefficients .

US6115687A
CLAIM 6
. The apparatus for converting a voice reproducing rate according to claim 1 , wherein said apparatus executes rate conversion processing using output information of a voice coding apparatus for coding a voice signal by dividing it into a linear predictive coefficients representing spectrum information , pitch period (decoder concealment, pitch period, decoder determines concealment) information and voice source information representing a predictive residual .

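Note (illustrative only): the artificial construction of the periodic excitation part recited in claim 1 above (a low-pass filtered train of pulses, the first impulse response centred on the quantized first-glottal-pulse position and the remaining ones spaced by the average pitch value) could be sketched as follows. The windowed-sinc low-pass filter, its length and the unit gain are assumptions made purely for illustration and are not drawn from the patent or the cited US6115687A reference.

```python
import numpy as np

def lowpass_impulse_response(cutoff=0.5, length=17):
    """Hypothetical linear-phase low-pass FIR (windowed sinc); the patent does
    not specify this particular filter."""
    n = np.arange(length) - (length - 1) / 2
    h = np.sinc(cutoff * n) * np.hamming(length)
    return h / np.sum(h)

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, gain=1.0):
    """Periodic part as a low-pass filtered pulse train: first impulse response
    centred on the quantized first-glottal-pulse position, remaining ones each
    one average pitch value apart, per claims 1 and 13."""
    assert avg_pitch > 0
    h = lowpass_impulse_response()
    half = len(h) // 2
    excitation = np.zeros(frame_len)
    pos = first_pulse_pos
    while pos < frame_len:
        for i, tap in enumerate(h):
            idx = pos - half + i
            if 0 <= idx < frame_len:
                excitation[idx] += gain * tap       # centre the impulse response on the pulse
        pos += int(round(avg_pitch))                # next pulse one average pitch later
    return excitation
```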
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6115687A
CLAIM 6
. The apparatus for converting a voice reproducing rate according to claim 1 , wherein said apparatus executes rate conversion processing using output information of a voice coding apparatus for coding a voice signal by dividing it into a linear predictive coefficients representing spectrum information , pitch period (decoder concealment, pitch period, decoder determines concealment) information and voice source information representing a predictive residual .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (inverse filter) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6115687A
CLAIM 4
. The apparatus for converting a voice reproducing rate according to claim 3 , wherein said apparatus comprising : linear predictive analysis means for calculating a linear predictive coefficients representing spectrum information of said input voice signal ;
inverse filter (impulse response) for calculating said prediction residual signal from said input voice signal using the calculated linear predictive coefficients ;
and synthesis filter for synthesizing a voice signal from a synthesis residual signal output from said waveform synthesis means using said linear predictive coefficients .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6115687A
CLAIM 6
. The apparatus for converting a voice reproducing rate according to claim 1 , wherein said apparatus executes rate conversion processing using output information of a voice coding apparatus for coding a voice signal by dividing it into a linear predictive coefficients representing spectrum information , pitch period (decoder concealment, pitch period, decoder determines concealment) information and voice source information representing a predictive residual .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (inverse filter) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6115687A
CLAIM 4
. The apparatus for converting a voice reproducing rate according to claim 3 , wherein said apparatus comprising : linear predictive analysis means for calculating a linear predictive coefficients representing spectrum information of said input voice signal ;
inverse filter (impulse response) for calculating said prediction residual signal from said input voice signal using the calculated linear predictive coefficients ;
and synthesis filter for synthesizing a voice signal from a synthesis residual signal output from said waveform synthesis means using said linear predictive coefficients .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (pitch period) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (inverse filter) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6115687A
CLAIM 4
. The apparatus for converting a voice reproducing rate according to claim 3 , wherein said apparatus comprising : linear predictive analysis means for calculating a linear predictive coefficients representing spectrum information of said input voice signal ;
inverse filter (impulse response) for calculating said prediction residual signal from said input voice signal using the calculated linear predictive coefficients ;
and synthesis filter for synthesizing a voice signal from a synthesis residual signal output from said waveform synthesis means using said linear predictive coefficients .

US6115687A
CLAIM 6
. The apparatus for converting a voice reproducing rate according to claim 1 , wherein said apparatus executes rate conversion processing using output information of a voice coding apparatus for coding a voice signal by dividing it into a linear predictive coefficients representing spectrum information , pitch period (decoder concealment, pitch period, decoder determines concealment) information and voice source information representing a predictive residual .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6115687A
CLAIM 6
. The apparatus for converting a voice reproducing rate according to claim 1 , wherein said apparatus executes rate conversion processing using output information of a voice coding apparatus for coding a voice signal by dividing it into a linear predictive coefficients representing spectrum information , pitch period (decoder concealment, pitch period, decoder determines concealment) information and voice source information representing a predictive residual .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (inverse filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6115687A
CLAIM 4
. The apparatus for converting a voice reproducing rate according to claim 3 , wherein said apparatus comprising : linear predictive analysis means for calculating a linear predictive coefficients representing spectrum information of said input voice signal ;
inverse filter (impulse response) for calculating said prediction residual signal from said input voice signal using the calculated linear predictive coefficients ;
and synthesis filter for synthesizing a voice signal from a synthesis residual signal output from said waveform synthesis means using said linear predictive coefficients .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6115687A
CLAIM 6
. The apparatus for converting a voice reproducing rate according to claim 1 , wherein said apparatus executes rate conversion processing using output information of a voice coding apparatus for coding a voice signal by dividing it into a linear predictive coefficients representing spectrum information , pitch period (decoder concealment, pitch period, decoder determines concealment) information and voice source information representing a predictive residual .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (inverse filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6115687A
CLAIM 4
. The apparatus for converting a voice reproducing rate according to claim 3 , wherein said apparatus comprising : linear predictive analysis means for calculating a linear predictive coefficients representing spectrum information of said input voice signal ;
inverse filter (impulse response) for calculating said prediction residual signal from said input voice signal using the calculated linear predictive coefficients ;
and synthesis filter for synthesizing a voice signal from a synthesis residual signal output from said waveform synthesis means using said linear predictive coefficients .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6029126A

Filed: 1998-06-30     Issued: 2000-02-22

Scalable audio coder and decoder

(Original Assignee) Microsoft Corp     (Current Assignee) Microsoft Technology Licensing LLC

Henrique S. Malvar
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (weighting function) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6029126A
CLAIM 13
. A computer implemented method for encoding an input signal comprising : receiving the input signal and computing a modulated lapped transform ;
modifying the modulated lapped transform to create a nonuniform modulated lapped biorthogonal transform with transform coefficients ;
and computing weighting functions (first impulse, first impulse response) having auditory masking capabilities and applying the weighting functions to the transform coefficients of the nonuniform modulated lapped biorthogonal transform .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (entropy encoder) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (weighting function) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6029126A
CLAIM 3
. The coder of claim 1 , further comprising an entropy encoder (decoder constructs) for encoding the transform coefficients .

US6029126A
CLAIM 13
. A computer implemented method for encoding an input signal comprising : receiving the input signal and computing a modulated lapped transform ;
modifying the modulated lapped transform to create a nonuniform modulated lapped biorthogonal transform with transform coefficients ;
and computing weighting functions (first impulse, first impulse response) having auditory masking capabilities and applying the weighting functions to the transform coefficients of the nonuniform modulated lapped biorthogonal transform .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6141638A

Filed: 1998-05-28     Issued: 2000-10-31

Method and apparatus for coding an information signal

(Original Assignee) Motorola Solutions Inc     (Current Assignee) Google Technology Holdings LLC

Weimin Peng, James Patrick Ashley
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter , and a phase information parameter (predetermined parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

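Note (illustrative only): claim 2 above recites encoding a shape, sign and amplitude of the first glottal pulse. A rough Python sketch of one possible encoding is given below; the unit-norm shape codebook, correlation-based shape search and scalar amplitude quantizer are assumptions for illustration and are not prescribed by the claim or the cited US6141638A reference.

```python
import numpy as np

def encode_first_glottal_pulse(residual, position, shape_codebook, amp_levels):
    """Rough sketch: quantize sign (1 bit), amplitude (nearest scalar level) and
    shape (index of the best-correlating entry in an assumed shape codebook)."""
    amplitude = float(residual[position])
    sign_bit = 0 if amplitude >= 0 else 1
    amp_index = int(np.argmin(np.abs(np.asarray(amp_levels) - abs(amplitude))))
    half = shape_codebook.shape[1] // 2
    padded = np.pad(residual, (half, half))                    # so the analysis window always fits
    segment = padded[position : position + shape_codebook.shape[1]]
    shape_index = int(np.argmax(shape_codebook @ segment))     # best-correlating shape entry
    return shape_index, sign_bit, amp_index
```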
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter , and a phase information parameter (predetermined parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter , and a phase information parameter (predetermined parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .

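Note (illustrative only): the energy-information computation of claim 4 above (maximum of a signal energy for frames classified as voiced or onset, average energy per sample otherwise) could be sketched as follows; interpreting the "maximum of a signal energy" as a per-subframe maximum, and assuming the frame length divides evenly into subframes, are simplifications made for illustration only.

```python
import numpy as np

def energy_information_parameter(frame, classification, subframe_len=64):
    """Energy information per claim 4: maximum energy for voiced/onset frames,
    average energy per sample for other frames."""
    if classification in ("voiced", "onset"):
        subframes = frame.reshape(-1, subframe_len)            # assumes exact division into subframes
        return float(np.max(np.sum(subframes ** 2, axis=1)))   # maximum subframe energy
    return float(np.mean(frame ** 2))                          # average energy per sample
```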
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter , and a phase information parameter (predetermined parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

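Note (illustrative only): the energy-control step of claim 5 above (scaling the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the concealed frame, then converging toward the energy given by the received energy-information parameter while limiting any increase) could be sketched as follows. The square-root gain definition, the short measurement windows, the linear gain trajectory and the fixed gain cap are assumptions, not the patent's exact procedure.

```python
import numpy as np

def scale_recovered_frame(synth, e_end_concealed, e_target, eps=1e-12, max_gain=2.0):
    """g0 matches the frame start to the energy at the end of the concealed frame;
    g1 moves toward the received energy information; both are capped to limit
    an increase in energy, and the gain is interpolated across the frame."""
    e_start = np.mean(synth[:16] ** 2) + eps             # short window at frame start (assumed)
    e_final = np.mean(synth[-16:] ** 2) + eps            # short window at frame end (assumed)
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)
    g1 = min(np.sqrt(e_target / e_final), max_gain)      # limit the energy increase
    gains = np.linspace(g0, g1, len(synth))              # sample-by-sample interpolation
    return synth * gains
```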
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter , and a phase information parameter (predetermined parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .
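As a reading aid for the phase-information limitation of claims 14 and 15, the sketch below assumes the first glottal pulse is located as the sample of maximum absolute amplitude of the LP residual within the first pitch period, and that its position is quantized to a uniform grid; the grid step and all names are assumptions.

#include <math.h>

/* Return the quantized position of the first glottal pulse, taken as the
 * sample of maximum absolute amplitude of the residual within the first
 * pitch period; grid_step is the assumed quantization grid. */
int find_first_glottal_pulse(const float *residual, int pitch_period, int grid_step)
{
    int best = 0;
    float best_amp = 0.0f;
    for (int i = 0; i < pitch_period; i++) {
        float a = fabsf(residual[i]);
        if (a > best_amp) {
            best_amp = a;               /* sample of maximum amplitude so far */
            best = i;
        }
    }
    /* Quantize the position to the nearest multiple of grid_step. */
    return (grid_step > 0) ? ((best + grid_step / 2) / grid_step) * grid_step : best;
}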

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .
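The energy-information limitation of claim 16 distinguishes voiced or onset frames (maximum of the signal energy) from all other frames (average energy per sample). A minimal sketch follows, assuming "maximum of a signal energy" means the largest squared sample in the frame and using an illustrative classification enum.

#include <stddef.h>

typedef enum { UNVOICED, UNVOICED_TRANSITION, VOICED_TRANSITION,
               VOICED, ONSET } frame_class_t;

/* Energy information parameter: peak energy for voiced/onset frames,
 * average energy per sample otherwise (interpretation is an assumption). */
float energy_info_parameter(const float *frame, size_t n, frame_class_t cls)
{
    if (n == 0)
        return 0.0f;
    if (cls == VOICED || cls == ONSET) {
        float max_e = 0.0f;             /* maximum of the signal energy */
        for (size_t i = 0; i < n; i++) {
            float e = frame[i] * frame[i];
            if (e > max_e)
                max_e = e;
        }
        return max_e;
    }
    float sum = 0.0f;                   /* average energy per sample */
    for (size_t i = 0; i < n; i++)
        sum += frame[i] * frame[i];
    return sum / (float)n;
}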

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .
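Claim 17 combines two operations: matching the energy at the start of the first good frame to the energy at the end of the concealed frame, then converging toward the transmitted energy while limiting any increase. The sketch below assumes a linear gain ramp across the frame and a simple gain cap; both choices, and all names, are assumptions rather than the patented method.

#include <math.h>
#include <stddef.h>

void scale_first_good_frame(float *synth, size_t n,
                            float e_end_concealed, /* energy at end of last erased frame    */
                            float e_start_frame,   /* measured energy at start of this frame */
                            float e_received,      /* energy from the transmitted parameter  */
                            float e_end_frame,     /* measured energy at end of this frame   */
                            float max_gain)        /* cap limiting the increase in energy    */
{
    if (n == 0)
        return;
    /* Start gain: make the start-of-frame energy similar to the end of the erasure. */
    float g0 = (e_start_frame > 0.0f) ? sqrtf(e_end_concealed / e_start_frame) : 1.0f;
    /* End gain: converge to the received energy information parameter. */
    float g1 = (e_end_frame > 0.0f) ? sqrtf(e_received / e_end_frame) : 1.0f;
    if (g1 > max_gain)
        g1 = max_gain;                  /* limit the increase in energy */
    float denom = (n > 1) ? (float)(n - 1) : 1.0f;
    for (size_t i = 0; i < n; i++) {
        /* Linear ramp from g0 at the frame start to g1 at the frame end. */
        float g = g0 + (g1 - g0) * (float)i / denom;
        synth[i] *= g;
    }
}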

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .
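Claim 19 carves out two cases in which no start-of-frame gain ramp is applied, the start gain simply being forced equal to the end gain: a voiced-to-unvoiced transition across the erasure, and a return from comfort noise to active speech. A hedged sketch with illustrative enum and flag names follows.

typedef enum { UNVOICED, UNVOICED_TRANSITION, VOICED_TRANSITION,
               VOICED, ONSET } frame_class_t;

/* Returns the gain to use at the start of the first good frame: the claim's
 * two special cases force it equal to the end-of-frame gain g1; otherwise
 * the regular start gain g0 is kept. */
float select_start_gain(frame_class_t last_good, frame_class_t first_good,
                        int last_was_comfort_noise, int first_is_active_speech,
                        float g0, float g1)
{
    int voiced_to_unvoiced =
        (last_good == VOICED_TRANSITION || last_good == VOICED || last_good == ONSET)
        && (first_good == UNVOICED);
    int cn_to_active = last_was_comfort_noise && first_is_active_speech;
    return (voiced_to_unvoiced || cn_to_active) ? g1 : g0;
}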

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .

US6141638A
CLAIM 2
. The method of claim 1 , wherein the information signal further comprises either a speech signal (speech signal, decoder determines concealment) , video signal or an audio signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined parameters) , an energy information parameter and a phase information parameter (predetermined parameters) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6141638A
CLAIM 1
. A method of coding an information signal comprising the steps of : selecting one of a plurality of configurations based on predetermined parameters (signal classification parameter, phase information parameter) related to the information signal , each of the plurality of configurations having a codebook ;
searching the codebook over a length of a codevector which is shorter than a subframe length to determine a codebook index from the codebook corresponding to the selected configuration ;
and transmitting the predetermined parameters and the codebook index to a destination .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6108626A

Filed: 1998-05-14     Issued: 2000-08-22

Object oriented audio coding

(Original Assignee) Robert Bosch GmbH; Centro Studi e Laboratori Telecomunicazioni SpA (CSELT)     (Current Assignee) CSELT- CENTRO STUDI E LABORATORI TELECOMUNICAZIONI SpA ; Robert Bosch GmbH ; Centro Studi e Laboratori Telecomunicazioni SpA (CSELT) ; Nuance Communications Inc

Luca Cellario, Michele Festa, Jorg Muller, Daniele Sereno
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (predetermined bandwidth) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6108626A
CLAIM 8
. A method as claimed in claim 1 , characterized in that said frequency bands have a predetermined bandwidth (pitch period) , independently of a sampling frequency of the signal to be coded .
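For the onset-reconstruction limitation of claim 1 charted above, the artificial periodic excitation is a low-pass filtered pulse train anchored at the quantized first-glottal-pulse position and repeated every average pitch period. The sketch below uses an illustrative 5-tap symmetric impulse response and simple buffer handling; neither detail is taken from the patent.

#include <stddef.h>
#include <string.h>

void build_periodic_excitation(float *exc, int exc_len,
                               int first_pulse_pos, /* quantized position, frame start = 0 */
                               int avg_pitch,       /* rounded average pitch value          */
                               float pulse_gain)
{
    /* Illustrative low-pass impulse response (symmetric, 5 taps). */
    static const float h[5] = { 0.1f, 0.2f, 0.4f, 0.2f, 0.1f };
    const int half = 2;

    if (avg_pitch <= 0 || exc_len <= 0)
        return;
    memset(exc, 0, (size_t)exc_len * sizeof(float));
    for (int pos = first_pulse_pos; pos < exc_len; pos += avg_pitch) {
        /* Centre one impulse response on the current pulse position. */
        for (int k = -half; k <= half; k++) {
            int idx = pos + k;
            if (idx >= 0 && idx < exc_len)
                exc[idx] += pulse_gain * h[k + half];
        }
    }
}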

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (predetermined bandwidth) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6108626A
CLAIM 8
. A method as claimed in claim 1 , characterized in that said frequency bands have a predetermined bandwidth (pitch period) , independently of a sampling frequency of the signal to be coded .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (lowest frequency, coding devices) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (desired bit rate) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6108626A
CLAIM 23
. A method as claimed in claim 22 , characterized in that said scaling comprises the following steps : a1) determining a bandwidth allocable in the frame to the or each macro-object for a desired bit rate (controlling energy) ;
b1) eliminating bit packets relevant to frequency bands which cause an exceeding of said bandwidth ;
c1) if the residual bit rate exceeds the desired bit rate , eliminating one block of enhancement information for each band , starting from the band with the highest frequency , until the desired bit rate is attained or the core information only is left , the elimination being cyclically repeated , if necessary ;
d1) if the residual bit rate at the end of step c1) still exceeds the desired bit rate , eliminating core packets of one or more frequency bands , starting from the highest frequency one .
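The scaling steps a1)-d1) quoted from US6108626A claim 23 describe a greedy packet-elimination procedure. The sketch below covers steps c1) and d1) only (cyclic removal of enhancement blocks from the highest band down, then removal of core packets); the per-band data layout and all names are assumptions.

#include <stddef.h>

typedef struct {
    int core_bits;     /* bits of the core packet for this band       */
    int enh_bits[4];   /* bits of up to four enhancement blocks       */
    int n_enh;         /* number of enhancement blocks currently kept */
    int keep_core;     /* non-zero while the core packet is kept      */
} band_t;

static int frame_bits(const band_t *b, int n_bands)
{
    int total = 0;
    for (int i = 0; i < n_bands; i++) {
        if (b[i].keep_core)
            total += b[i].core_bits;
        for (int j = 0; j < b[i].n_enh; j++)
            total += b[i].enh_bits[j];
    }
    return total;
}

void scale_to_bit_rate(band_t *b, int n_bands, int desired_bits)
{
    /* c1) cyclically drop one enhancement block per band, highest band first. */
    int dropped = 1;
    while (frame_bits(b, n_bands) > desired_bits && dropped) {
        dropped = 0;
        for (int i = n_bands - 1; i >= 0; i--) {
            if (b[i].n_enh > 0) {
                b[i].n_enh--;
                dropped = 1;
                if (frame_bits(b, n_bands) <= desired_bits)
                    return;
            }
        }
    }
    /* d1) if still too large, drop core packets starting from the highest band. */
    for (int i = n_bands - 1; i >= 0 && frame_bits(b, n_bands) > desired_bits; i--)
        b[i].keep_core = 0;
}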

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (said processing unit, predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (other bands) and the first non erased frame received after frame erasure is encoded as active speech .
US6108626A
CLAIM 36
. An apparatus as claimed in claim 32 , characterized in that the first and second coding units (LCC , HCC , LEC , HEC) of each band are configurable independently of the coding units of the other bands (comfort noise) .

US6108626A
CLAIM 46
. An apparatus as claimed in claim 44 characterized in that said means (BCL) for the quality increase evaluation exploit information on a perceptual model provided by said processing unit (last non) (PMP) .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (lowest frequency, coding devices) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (lowest frequency, coding devices) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame (given frequency) , E_LP0 is an energy of an impulse response of the LP filter of a last non (said processing unit, predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 11
. A method as claimed in claim 1 , characterized in that the selection of the frequency bands to be submitted to at least the first coding step , the selection of the bands for which also second coding steps are to be performed and the number of second coding steps for a given frequency (current frame) band are determined in dependency of the bandwidth and bit rate desired for the coded signal and on requirements of a user equipment (US) and of a system (SY) in which the coded signal is exploited , independently of the bandwidth and sampling frequency of the signal to be coded , on a frame per frame basis .

US6108626A
CLAIM 46
. An apparatus as claimed in claim 44 characterized in that said means (BCL) for the quality increase evaluation exploit information on a perceptual model provided by said processing unit (last non) (PMP) .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (predetermined bandwidth) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6108626A
CLAIM 8
. A method as claimed in claim 1 , characterized in that said frequency bands have a predetermined bandwidth (pitch period) , independently of a sampling frequency of the signal to be coded .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (lowest frequency, coding devices) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame (given frequency) , E_LP0 is an energy of an impulse response of the LP filter of a last non (said processing unit, predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 11
. A method as claimed in claim 1 , characterized in that the selection of the frequency bands to be submitted to at least the first coding step , the selection of the bands for which also second coding steps are to be performed and the number of second coding steps for a given frequency (current frame) band are determined in dependency of the bandwidth and bit rate desired for the coded signal and on requirements of a user equipment (US) and of a system (SY) in which the coded signal is exploited , independently of the bandwidth and sampling frequency of the signal to be coded , on a frame per frame basis .

US6108626A
CLAIM 46
. An apparatus as claimed in claim 44 characterized in that said means (BCL) for the quality increase evaluation exploit information on a perceptual model provided by said processing unit (last non) (PMP) .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (predetermined bandwidth) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6108626A
CLAIM 8
. A method as claimed in claim 1 , characterized in that said frequency bands have a predetermined bandwidth (pitch period) , independently of a sampling frequency of the signal to be coded .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (predetermined bandwidth) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6108626A
CLAIM 8
. A method as claimed in claim 1 , characterized in that said frequency bands have a predetermined bandwidth (pitch period) , independently of a sampling frequency of the signal to be coded .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (lowest frequency, coding devices) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (said processing unit, predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (other bands) and the first non erased frame received after frame erasure is encoded as active speech .
US6108626A
CLAIM 36
. An apparatus as claimed in claim 32 , characterized in that the first and second coding units (LCC , HCC , LEC , HEC) of each band are configurable independently of the coding units of the other bands (comfort noise) .

US6108626A
CLAIM 46
. An apparatus as claimed in claim 44 characterized in that said means (BCL) for the quality increase evaluation exploit information on a perceptual model provided by said processing unit (last non) (PMP) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (lowest frequency, coding devices) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (lowest frequency, coding devices) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame (given frequency) , E_LP0 is an energy of an impulse response of a LP filter of a last non (said processing unit, predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 11
. A method as claimed in claim 1 , characterized in that the selection of the frequency bands to be submitted to at least the first coding step , the selection of the bands for which also second coding steps are to be performed and the number of second coding steps for a given frequency (current frame) band are determined in dependency of the bandwidth and bit rate desired for the coded signal and on requirements of a user equipment (US) and of a system (SY) in which the coded signal is exploited , independently of the bandwidth and sampling frequency of the signal to be coded , on a frame per frame basis .

US6108626A
CLAIM 46
. An apparatus as claimed in claim 44 characterized in that said means (BCL) for the quality increase evaluation exploit information on a perceptual model provided by said processing unit (last non) (PMP) .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (predetermined bandwidth) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6108626A
CLAIM 8
. A method as claimed in claim 1 , characterized in that said frequency bands have a predetermined bandwidth (pitch period) , independently of a sampling frequency of the signal to be coded .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (lowest frequency, coding devices) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (lowest frequency, coding devices) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame (given frequency) , E_LP0 is an energy of an impulse response of a LP filter of a last non (said processing unit, predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6108626A
CLAIM 6
. A method as claimed in claim 1 , characterized in that the selection of the first and second algorithm is carried out in dependence of configuration information passed from a user equipment (US) to coding devices (signal energy, LP filter, LP filter excitation signal) (AC) and/or of control information passed from a transmission system (SY) to the coding devices .

US6108626A
CLAIM 11
. A method as claimed in claim 1 , characterized in that the selection of the frequency bands to be submitted to at least the first coding step , the selection of the bands for which also second coding steps are to be performed and the number of second coding steps for a given frequency (current frame) band are determined in dependency of the bandwidth and bit rate desired for the coded signal and on requirements of a user equipment (US) and of a system (SY) in which the coded signal is exploited , independently of the bandwidth and sampling frequency of the signal to be coded , on a frame per frame basis .

US6108626A
CLAIM 46
. An apparatus as claimed in claim 44 characterized in that said means (BCL) for the quality increase evaluation exploit information on a perceptual model provided by said processing unit (last non) (PMP) .

US6108626A
CLAIM 48
. An apparatus as claimed in claim 47 , characterized in that the combination means (BCU) are arranged to transmit the bit packets within a macro-object bit stream (OB11 . . . OB21) in an order of frequency band , starting with the lowest frequency (signal energy, LP filter, LP filter excitation signal) band .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
WO9953479A1

Filed: 1998-04-15     Issued: 1999-10-21

Fast frame optimisation in an audio encoder

(Original Assignee) Sgs-Thomson Microelectronics Asia Pacific (Pte) Ltd.     

Mohammed Javed Absar, Sapna George, Antonio Mario Alvarez-Tinoco
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

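For orientation only, the phase-information limitation recited in claims 2 and 3 (locating the first glottal pulse as the sample of maximum amplitude within a pitch period, then conveying its quantized position together with its sign and amplitude) can be illustrated with a short sketch. Everything below is an assumption made for illustration: the residual array, the pitch estimate, the 6-bit uniform position quantizer and the function name are hypothetical and are not taken from the patent or from any standardized codec.

# Minimal sketch (not the patented implementation) of claims 2-3:
# find the max-amplitude sample within the first pitch period and
# quantize its position; also record its sign and amplitude.
import numpy as np

def first_glottal_pulse_info(residual: np.ndarray, pitch_period: int, pos_bits: int = 6):
    """Return (quantized position, sign, amplitude) of the maximum-amplitude
    sample found within the first pitch period of the frame."""
    segment = residual[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))            # sample of maximum amplitude
    sign = 1 if segment[pos] >= 0 else -1
    amplitude = float(np.abs(segment[pos]))
    step = pitch_period / (1 << pos_bits)            # uniform quantization of the position
    q_pos = min(int(round(pos / step)), (1 << pos_bits) - 1)
    return q_pos, sign, amplitude

# Example with synthetic data (an artificial "glottal pulse" at sample 37).
rng = np.random.default_rng(0)
res = 0.1 * rng.standard_normal(256)
res[37] = 1.0
print(first_glottal_pulse_info(res, pitch_period=80))
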
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

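As a reading aid for the energy-information limitation of claim 4 (a maximum-based energy for frames classified as voiced or onset, an average energy per sample for other frames), the following hedged sketch shows one way such a parameter could be computed; the pitch-synchronous windowing, the dB scaling and the class labels are illustrative assumptions, not the patent's implementation.

# Illustrative sketch of the claim 4 energy information parameter.
import numpy as np

def energy_information(frame: np.ndarray, frame_class: str, pitch_period: int) -> float:
    if frame_class in ("voiced", "onset"):
        # Maximum of the signal energy, taken here over consecutive
        # pitch-length windows (an assumed pitch-synchronous measure).
        energies = [np.sum(frame[i:i + pitch_period] ** 2)
                    for i in range(0, len(frame) - pitch_period + 1, pitch_period)]
        e = max(energies)
    else:
        # Average energy per sample for unvoiced / transition frames.
        e = np.mean(frame ** 2)
    return 10.0 * np.log10(e + 1e-12)                # reported in dB for convenience

frame = np.sin(2 * np.pi * 100 * np.arange(256) / 8000.0)
print(energy_information(frame, "voiced", pitch_period=80))
print(energy_information(frame, "unvoiced", pitch_period=80))
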
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (output bits) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (data blocks) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

WO9953479A1
CLAIM 2
. A method as claimed in claim 1 , wherein the frame is arranged in a plurality of data blocks (last frame) , each block having the plurality of variable size fields corresponding to different coded data types .

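The two-sided energy control of claim 5 (scale the start of the first non-erased frame so its energy resembles the energy at the end of the last concealed frame, then converge toward the received energy target by the end of the frame while limiting any increase) can be pictured with the minimal sketch below; the 32-sample energy windows, the linear gain interpolation and the gain cap are assumptions made only for illustration, not the codec's algorithm.

# Minimal, assumption-laden sketch of the claim 5 energy control.
import numpy as np

def scale_recovered_frame(synth: np.ndarray,
                          e_end_concealed: float,
                          e_target: float,
                          max_gain: float = 2.0) -> np.ndarray:
    eps = 1e-12
    e_start = np.mean(synth[:32] ** 2) + eps         # energy at the frame start
    e_tail = np.mean(synth[-32:] ** 2) + eps         # energy at the frame end
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)   # match the concealed frame
    g1 = min(np.sqrt(e_target / e_tail), max_gain)           # converge to the target
    gains = np.linspace(g0, g1, len(synth))          # sample-by-sample interpolation
    return synth * gains

frame = 0.5 * np.ones(256)
out = scale_recovered_frame(frame, e_end_concealed=0.04, e_target=0.25)
print(out[0], out[-1])
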
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (output bits) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (output bits) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (output bits) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

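To keep the gain-handling limitations of claims 6 and 7 straight (cap the scaling gain when the first non-erased frame after the erasure is classified as onset; make the gain at the frame start equal to the end-of-frame gain across voiced-to-unvoiced and comfort-noise-to-active-speech transitions), here is a hedged decision sketch; the class labels, cap value and function name are hypothetical illustrations rather than the patented method.

# Hedged sketch of the gain-selection rules of claims 6-7.
def recovery_gains(g_start: float, g_end: float,
                   last_class: str, first_class: str,
                   onset_cap: float = 1.0) -> tuple[float, float]:
    if first_class == "onset":
        # Claim 6: limit to a given value the gain used for scaling.
        return min(g_start, onset_cap), min(g_end, onset_cap)
    voiced_like = ("voiced", "voiced transition", "onset")
    if (last_class in voiced_like and first_class == "unvoiced") or \
       (last_class == "comfort noise" and first_class == "active speech"):
        # Claim 7: reuse the end-of-frame gain at the frame start.
        return g_end, g_end
    return g_start, g_end

print(recovery_gains(1.8, 0.9, "voiced", "unvoiced"))   # -> (0.9, 0.9)
print(recovery_gains(1.8, 0.9, "voiced", "onset"))      # -> (1.0, 0.9)
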
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (output bits) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data blocks) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

WO9953479A1
CLAIM 2
. A method as claimed in claim 1 , wherein the frame is arranged in a plurality of data blocks (last frame) , each block having the plurality of variable size fields corresponding to different coded data types .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (output bits) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (output bits) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

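Claim 9 recites the relation E_q = E_1 · (E_LP0 / E_LP1), with E_LP0 and E_LP1 the impulse-response energies of the LP synthesis filters of the last non-erased frame before the erasure and of the first non-erased frame after it. A small worked example follows; the LP coefficient vectors and the value of E_1 are arbitrary stable examples chosen for illustration, not values from the patent.

# Worked numerical sketch of the claim 9 relation E_q = E_1 * (E_LP0 / E_LP1).
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the impulse response of the synthesis filter 1/A(z),
    with A(z) = a[0] + a[1] z^-1 + ... and a[0] assumed equal to 1."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0                 # unit impulse input
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * h[i - k]               # IIR recursion of 1/A(z)
        h[i] = acc
    return float(np.sum(h ** 2))

a_last = [1.0, -0.9]     # A(z) of the last non-erased frame (assumed)
a_first = [1.0, -0.5]    # A(z) of the first non-erased frame (assumed)
e1 = 0.02                # energy at the end of the current frame (assumed)

e_lp0 = lp_impulse_response_energy(a_last)
e_lp1 = lp_impulse_response_energy(a_first)
e_q = e1 * e_lp0 / e_lp1                             # adjusted excitation energy
print(round(e_lp0, 3), round(e_lp1, 3), round(e_q, 4))
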
US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (output bits) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data blocks) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (output bits) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

WO9953479A1
CLAIM 2
. A method as claimed in claim 1 , wherein the frame is arranged in a plurality of data blocks (last frame) , each block having the plurality of variable size fields corresponding to different coded data types .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (output bits) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (data blocks) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

WO9953479A1
CLAIM 2
. A method as claimed in claim 1 , wherein the frame is arranged in a plurality of data blocks (last frame) , each block having the plurality of variable size fields corresponding to different coded data types .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (output bits) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (output bits) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (output bits) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (output bits) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data blocks) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

WO9953479A1
CLAIM 2
. A method as claimed in claim 1 , wherein the frame is arranged in a plurality of data blocks (last frame) , each block having the plurality of variable size fields corresponding to different coded data types .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (output bits) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (output bits) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bitstream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (different types) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (output bits) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (data blocks) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (output bits) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
WO9953479A1
CLAIM 1
. A method of processing input audio data for compression into an encoded bitstream comprising a series of fixed size frames , each of the fixed size frames having a plurality of variable size fields containing coded data of different types (signal classification parameter) , the method including the steps of : receiving input data to be coded into a frame of the output bits (first non, last non) tream ;
preprocessing the input data to determine at least one first coding parameter to be used for coding the input data into at least one of the variable size fields in the frame , wherein the value of the at least one first coding parameter affects the data space size required for the at least one variable size field ;
storing the at least one first coding parameter determined in the preprocessing step ;
allocating data space in the frame for at least one other of the variable size fields on the basis of the determined at least one first coding parameter ;
determining at least one second coding parameter for coding data into the at least one other variable sized field on the basis of said allocated space ;
and coding the input data into the variable sized fields of the frame using the first and second coding parameters .

WO9953479A1
CLAIM 2
. A method as claimed in claim 1 , wherein the frame is arranged in a plurality of data blocks (last frame) , each block having the plurality of variable size fields corresponding to different coded data types .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6208962B1

Filed: 1998-04-02     Issued: 2001-03-27

Signal coding system

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Kazunori Ozawa
US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6208962B1
CLAIM 2
. A signal coding system as set forth in claim 1 , further comprising : a level calculator which divides said orthogonal transformation coefficient derived by said orthogonal transformation circuit by a predetermined number (last non) and determines an average level for a plurality of said orthogonal transformation coefficients after said orthogonal transformation coefficients are divided by said predetermined number , and wherein said coefficient calculating circuit expresses an envelop of said average level derived by said level calculator using said plurality of calculated coefficients .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6208962B1
CLAIM 2
. A signal coding system as set forth in claim 1 , further comprising : a level calculator which divides said orthogonal transformation coefficient derived by said orthogonal transformation circuit by a predetermined number (last non) and determines an average level for a plurality of said orthogonal transformation coefficients after said orthogonal transformation coefficients are divided by said predetermined number , and wherein said coefficient calculating circuit expresses an envelop of said average level derived by said level calculator using said plurality of calculated coefficients .

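The relation recited in claim 9 (and again in claims 12, 21 and 25) rescales the excitation energy by the ratio of the energies of the LP filter impulse responses before and after the erasure. The following minimal sketch computes that relation as reconstructed above, E q = E 1 (E LP0 / E LP1); the 64-sample impulse-response truncation and the helper names are assumptions made for illustration.

import numpy as np

def lp_impulse_response_energy(a, n=64):
    # a: LP analysis coefficients [1, a1, ..., ap]; the synthesis filter is 1/A(z),
    # so h[k] = delta[k] - sum_{i=1..p} a[i] * h[k-i].
    h = np.zeros(n)
    p = len(a) - 1
    for k in range(n):
        acc = 1.0 if k == 0 else 0.0
        for i in range(1, min(k, p) + 1):
            acc -= a[i] * h[k - i]
        h[k] = acc
    return float(np.dot(h, h))

def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
    # E_q = E_1 * (E_LP0 / E_LP1): target excitation energy in the first good frame.
    e_lp0 = lp_impulse_response_energy(a_last_good)   # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # first good frame after the erasure
    e_q = e1 * e_lp0 / e_lp1
    e_cur = float(np.dot(excitation, excitation)) + 1e-12
    return excitation * np.sqrt(e_q / e_cur)          # rescale the excitation to energy E_q
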
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6208962B1
CLAIM 2
. A signal coding system as set forth in claim 1 , further comprising : a level calculator which divides said orthogonal transformation coefficient derived by said orthogonal transformation circuit by a predetermined number (last non) and determines an average level for a plurality of said orthogonal transformation coefficients after said orthogonal transformation coefficients are divided by said predetermined number , and wherein said coefficient calculating circuit expresses an envelop of said average level derived by said level calculator using said plurality of calculated coefficients .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6208962B1
CLAIM 2
. A signal coding system as set forth in claim 1 , further comprising : a level calculator which divides said orthogonal transformation coefficient derived by said orthogonal transformation circuit by a predetermined number (last non) and determines an average level for a plurality of said orthogonal transformation coefficients after said orthogonal transformation coefficients are divided by said predetermined number , and wherein said coefficient calculating circuit expresses an envelop of said average level derived by said level calculator using said plurality of calculated coefficients .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6208962B1
CLAIM 2
. A signal coding system as set forth in claim 1 , further comprising : a level calculator which divides said orthogonal transformation coefficient derived by said orthogonal transformation circuit by a predetermined number (last non) and determines an average level for a plurality of said orthogonal transformation coefficients after said orthogonal transformation coefficients are divided by said predetermined number , and wherein said coefficient calculating circuit expresses an envelop of said average level derived by said level calculator using said plurality of calculated coefficients .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (residual error) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6208962B1
CLAIM 1
. A signal coding system for coding an input signal , said signal coding system comprising : a spectral parameter calculating circuit which derives a spectral parameter of said input signal ;
a predicting circuit which derives a predictive residual error (frame concealment, decoder determines concealment) based upon a result of a prediction of said input signal ;
an orthogonal transformation circuit which derives an orthogonal transformation coefficient by performing an orthogonal transformation on said predictive residual error ;
a coefficient calculating circuit which expresses an envelop of a plurality of said orthogonal transformation coefficients as a plurality of calculated coefficients ;
and a quantizer which quantizes said orthogonal transformation coefficients by expressing said orthogonal transformation coefficients as a plurality of pulses thereby producing a quantization result , said quantizer further outputs a combination of said spectral parameter , said calculated coefficients and said quantization result .

US6208962B1
CLAIM 2
. A signal coding system as set forth in claim 1 , further comprising : a level calculator which divides said orthogonal transformation coefficient derived by said orthogonal transformation circuit by a predetermined number (last non) and determines an average level for a plurality of said orthogonal transformation coefficients after said orthogonal transformation coefficients are divided by said predetermined number , and wherein said coefficient calculating circuit expresses an envelop of said average level derived by said level calculator using said plurality of calculated coefficients .

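US6208962B1 claim 2 is repeatedly cited above for its level calculator. The claim wording is ambiguous as to whether the coefficients are numerically divided by the predetermined number or divided into groups of that size; the sketch below takes the literal numeric-division reading and then averages the magnitudes over fixed groups to form the envelope. The constants and the grouping step are assumptions for illustration only.

import numpy as np

def average_level_envelope(coeffs, predetermined_number=4.0, group_size=8):
    # Divide each transform coefficient by a predetermined number, then take an
    # average (magnitude) level over consecutive groups of coefficients; the
    # per-group levels form the coarse envelope that the coefficient calculating
    # circuit would then model. Both constants are assumed values.
    scaled = np.abs(np.asarray(coeffs, dtype=float)) / predetermined_number
    n_groups = len(scaled) // group_size
    return scaled[: n_groups * group_size].reshape(n_groups, group_size).mean(axis=1)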



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6236961B1

Filed: 1998-03-23     Issued: 2001-05-22

Speech signal coder

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses (impulse responses, impulse response) of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

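Claims 1 and 13 construct the periodic excitation for a lost onset frame as a low-pass filtered train of pulses: the first impulse response of the low-pass filter is centred on the quantized position of the first glottal pulse, and the remaining ones are placed an average pitch period apart up to the end of the reconstructed region. A minimal sketch of that construction follows, assuming an integer pitch and a short symmetric low-pass kernel (both hypothetical).

import numpy as np

def build_periodic_excitation(first_pulse_pos, avg_pitch, h_lp, length):
    # Sum copies of the low-pass impulse response h_lp, the first centred on the
    # quantized first glottal pulse position, the others one average pitch
    # period apart, until the end of the region being reconstructed.
    exc = np.zeros(length)
    half = len(h_lp) // 2
    step = max(1, int(avg_pitch))          # guard against a degenerate pitch value
    pos = int(first_pulse_pos)
    while pos < length:
        for i, tap in enumerate(h_lp):
            idx = pos - half + i
            if 0 <= idx < length:
                exc[idx] += tap
        pos += step
    return exc

# Illustrative use with an assumed 3-tap low-pass kernel:
# exc = build_periodic_excitation(first_pulse_pos=17, avg_pitch=60,
#                                 h_lp=np.array([0.25, 0.5, 0.25]), length=256)
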
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment) in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (inverse filtering) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q (pitch p) = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response (impulse responses) of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters (E q) from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses (impulse responses, impulse response) of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment) in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (pitch p) = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response (impulse responses) of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters (E q) from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses (impulse responses, impulse response) of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment) in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses (impulse responses, impulse response) of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment) in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (inverse filtering) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q (pitch p) = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response (impulse responses) of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters (E q) from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses (impulse responses, impulse response) of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment) in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (pitch p) = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response (impulse responses) of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6236961B1
CLAIM 1
. A speech signal coder for coding a speech signal , the speech signal coder comprising : a parameter calculator which calculates spectral and pitch parameters (E q) from the speech signal thereby producing calculated parameters , and quantizes the calculated parameters thereby producing quantized spectral and pitch parameters ;
an impulse response calculator having a filter , the impulse response calculator calculates impulse responses (impulse responses, impulse response) of the quantized spectral and pitch parameters by using the filter ;
a first orthogonal transform circuit which produces a first transform signal by performing an orthogonal transform of the speech signal using inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment) in accordance with the quantized spectral and pitch parameters ;
a second orthogonal transform circuit which transforms the impulse responses to produce a second transform signal ;
and a pulse quantizer which quantizes the first transform signal using the second transform signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
GB2324689A

Filed: 1998-03-16     Issued: 1998-10-28

Dual subframe quantisation of spectral magnitudes

(Original Assignee) Digital Voice Systems Inc     (Current Assignee) Digital Voice Systems Inc

John Clark Hardwick
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (last subframe) affected by the artificial construction of the periodic part .
GB2324689A
CLAIM 15
. The method of claim 1 or 2 , wherein the predicted spectral magnitude parameters are formed by applying a gain of less than unity to a linear interpolation of the quantized spectral magnitudes from the last subframe (last subframe) in the previous block .

GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

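GB2324689A (claims 1 and 14, and the dependent claim quoted above) quantizes the spectral magnitudes of two subframes jointly against a prediction formed from the previous block. The sketch below shows the prediction/residual step and its decoder-side inverse under the assumptions of equal-length magnitude vectors and an illustrative prediction gain of 0.65; it is a simplified reading, not the reference's actual quantizer.

import numpy as np

def predict_and_residual(mags_sub1, mags_sub2, prev_block_last_sub_quantized, rho=0.65):
    # Encoder side: predicted magnitudes are a gain (< 1) times the quantized
    # magnitudes of the last subframe of the previous block; the residuals of
    # both subframes are combined into one vector that would then be vector
    # quantized. Equal-length vectors are assumed, so the claimed 'linear
    # interpolation' degenerates to a copy here.
    pred = rho * prev_block_last_sub_quantized
    return np.concatenate([mags_sub1 - pred, mags_sub2 - pred])

def reconstruct_magnitudes(combined_residual, prev_block_last_sub_quantized, rho=0.65):
    # Decoder side (claim 14): add the de-quantized residuals back onto the same prediction.
    half = len(combined_residual) // 2
    pred = rho * prev_block_last_sub_quantized
    return pred + combined_residual[:half], pred + combined_residual[half:]
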
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

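Claim 3 determines the phase information by taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position. A minimal sketch of that search with a uniform position quantizer follows; the quantization step and the returned sign are illustrative assumptions.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, q_step=4):
    # Search the first pitch period of the frame for the sample of maximum
    # amplitude (taken as the first glottal pulse) and quantize its position
    # with a uniform step; the sign of the pulse is returned alongside it.
    T = int(min(pitch_period, len(residual)))
    segment = residual[:T]
    pos = int(np.argmax(np.abs(segment)))
    sign = 1 if segment[pos] >= 0 else -1
    q_pos = int(round(pos / q_step)) * q_step
    return q_pos, sign
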
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (Discrete Cosine, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .

GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

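Claim 4 computes the energy information parameter in relation to a maximum of the signal energy for frames classified as voiced or onset, and in relation to an average energy per sample for other frames. A minimal sketch follows, under the assumption that the maximum of the signal energy can be read as the maximum squared sample value within the frame; the class labels are placeholders.

import numpy as np

def energy_info_parameter(frame, frame_class):
    # Voiced or onset frames: energy information tracks the maximum of the
    # signal energy (read here as the maximum squared sample value).
    # Other frames: average energy per sample.
    if frame_class in ("VOICED", "ONSET"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))
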
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

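Claim 5 controls the energy of the synthesized signal in the first good frame after an erasure: the frame start is scaled to the energy at the end of the last erased (concealed) frame, and the gain then evolves so that the energy converges toward the received energy information parameter by the frame end, with the increase limited. A minimal sketch with assumed measurement windows and an assumed cap on the energy increase:

import numpy as np

def energy_control_first_good_frame(synth, e_end_concealed, e_target, max_energy_increase=1.2):
    # Scale the start of the first good frame to the energy at the end of the
    # last concealed frame, then interpolate the gain linearly so that the
    # energy converges toward e_target (the received energy information) by the
    # frame end, while capping the allowed energy increase.
    eps = 1e-12
    n = len(synth)
    w = max(1, n // 4)                                   # assumed measurement window
    e_start = float(np.mean(synth[:w] ** 2)) + eps
    e_end = float(np.mean(synth[-w:] ** 2)) + eps
    g_start = np.sqrt(e_end_concealed / e_start)         # match the end of the last erased frame
    g_end = np.sqrt(min(e_target, max_energy_increase * e_end) / e_end)
    gains = np.linspace(g_start, g_end, n)               # sample-by-sample gain evolution
    return synth * gains
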
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (Discrete Cosine, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoded bits) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .

GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (Discrete Cosine, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoded bits) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .
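
The relation recited in claim 12 above scales the excitation energy by the ratio of the impulse-response energies of the two LP filters. The following is a minimal numerical sketch of that relation, E_q = E_1 · (E_LP0 / E_LP1); the LP coefficients, frame length and helper names are illustrative assumptions, not taken from the patent.

```python
# Sketch of the claim 12 energy-adjustment relation E_q = E_1 * (E_LP0 / E_LP1).
# All coefficient values, lengths and names are illustrative only.
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the impulse response of the all-pole filter 1/A(z),
    where a = [1, a1, a2, ...] are the LP coefficients."""
    p = len(a) - 1
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0            # unit impulse input
        for k in range(1, min(i, p) + 1):
            acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.dot(h, h))

def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
    """Scale the decoded excitation of the first good frame so that its
    energy becomes E_q = E_1 * E_LP0 / E_LP1."""
    e_lp0 = lp_impulse_response_energy(a_last_good)    # LP filter before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)   # LP filter after the erasure
    e_q = e1 * e_lp0 / e_lp1
    e_cur = float(np.dot(excitation, excitation)) + 1e-12
    return excitation * np.sqrt(e_q / e_cur)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    exc = rng.standard_normal(256)              # hypothetical decoded excitation frame
    a0 = np.array([1.0, -0.9])                  # toy LP filter of the last good frame
    a1 = np.array([1.0, -0.5])                  # toy LP filter of the first good frame
    out = adjust_excitation_energy(exc, e1=1.0, a_last_good=a0, a_first_good=a1)
    print(round(float(np.dot(out, out)), 3))    # equals E_q
```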

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoded bits) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (last subframe) affected by the artificial construction of the periodic part .
GB2324689A
CLAIM 1
. The method of claim 1 or 2 , wherein the predicted spectral magnitude parameters are formed by applying a gain of less than unity to a linear interpolation of the quantized spectral magnitudes from the last subframe (last subframe) in the previous block .

GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .
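
Claim 13 above constructs the periodic excitation artificially as a low-pass filtered pulse train anchored on the quantized first glottal pulse position. Below is a minimal sketch of that construction, assuming a toy low-pass impulse response and frame length that are not taken from the patent.

```python
# Sketch of the claim 13 artificial periodic excitation: copies of a low-pass
# filter impulse response, the first centred on the quantized glottal-pulse
# position, the rest spaced by the rounded average pitch value.
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    """Place `lp_ir` (assumed peak at len(lp_ir)//2) at first_pulse_pos,
    first_pulse_pos + T, ... up to the end of the affected region."""
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start = pos - half                      # centre the impulse response on the pulse
        for k, tap in enumerate(lp_ir):
            idx = start + k
            if 0 <= idx < frame_len:
                exc[idx] += tap
        pos += int(round(avg_pitch))            # next pulse one average pitch period later
    return exc

if __name__ == "__main__":
    ir = np.array([0.1, 0.3, 0.6, 1.0, 0.6, 0.3, 0.1])   # toy low-pass impulse response
    e = build_periodic_excitation(frame_len=256, first_pulse_pos=17,
                                  avg_pitch=57.4, lp_ir=ir)
    print(np.flatnonzero(e == e.max()))                   # positions of the pulse centres
```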

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoded bits) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoded bits) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .
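
Claims 14 and 15 above determine the phase information by locating the sample of maximum amplitude within a pitch period and quantizing its position (with the shape, sign and amplitude also encoded in claim 14). The sketch below illustrates that search and a uniform position quantization; the grid step and the signal are illustrative assumptions.

```python
# Sketch of the phase-information determination of claims 14-15: the sample of
# maximum amplitude inside the first pitch period is taken as the first glottal
# pulse and its position is quantized on a uniform grid (step is illustrative).
import numpy as np

def find_and_quantize_glottal_pulse(residual, pitch_period, step=8):
    """Return (position, quantized position, sign, amplitude) of the
    maximum-amplitude sample within the first pitch period."""
    segment = residual[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))          # sample of maximum amplitude
    q_pos = int(round(pos / step)) * step          # uniform scalar quantization of the position
    sign = 1 if segment[pos] >= 0 else -1
    amplitude = float(abs(segment[pos]))
    return pos, q_pos, sign, amplitude

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    res = rng.standard_normal(256) * 0.1
    res[43] = -1.7                                 # pretend glottal pulse
    print(find_and_quantize_glottal_pulse(res, pitch_period=60))
```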

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoded bits) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (Discrete Cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .

GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .
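
Claim 16 above computes the energy information parameter differently depending on the frame class: the maximum of the signal energy for voiced or onset frames, and the average energy per sample otherwise. A minimal sketch of that rule follows; the class labels and frame content are illustrative assumptions.

```python
# Sketch of the claim 16 energy-information parameter: maximum sample energy
# for voiced/onset frames, average energy per sample for the other classes.
import numpy as np

def energy_information(frame, frame_class):
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))           # maximum of the signal energy
    return float(np.mean(frame ** 2))              # average energy per sample

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    frame = rng.standard_normal(256)
    print(round(energy_information(frame, "voiced"), 3),
          round(energy_information(frame, "unvoiced"), 3))
```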

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoded bits) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .
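
Claim 17 above controls the energy of the synthesized signal in the first good frame: its start is matched to the end of the concealed signal and its gain is then converged toward the value implied by the received energy parameter, with the increase limited. The sketch below illustrates one such sample-by-sample interpolation; the window sizes and the gain cap are illustrative assumptions.

```python
# Sketch of the claim 17 energy control: start the first good frame at a gain
# matching the concealed signal, converge linearly toward the gain implied by
# the transmitted energy parameter, and cap any energy increase.
import numpy as np

def scale_first_good_frame(synth, e_concealed_end, e_target, win=32, max_gain=2.0):
    e_begin = float(np.mean(synth[:win] ** 2)) + 1e-12   # energy at frame start
    e_end = float(np.mean(synth[-win:] ** 2)) + 1e-12    # energy at frame end
    g0 = np.sqrt(e_concealed_end / e_begin)              # match the concealed signal
    g1 = np.sqrt(e_target / e_end)                       # converge to the received energy
    g0, g1 = min(g0, max_gain), min(g1, max_gain)        # limit the increase in energy
    gains = np.linspace(g0, g1, len(synth))              # per-sample gain interpolation
    return synth * gains

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    frame = rng.standard_normal(256)
    out = scale_first_good_frame(frame, e_concealed_end=0.25, e_target=1.0)
    print(round(float(np.mean(out[:32] ** 2)), 2),
          round(float(np.mean(out[-32:] ** 2)), 2))
```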

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (Discrete Cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoded bits) , limits to a given value a gain used for scaling the synthesized sound signal .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .

GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .
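
GB2324689A claim 1, charted above, jointly quantizes the spectral magnitudes of two subframes by predicting them from the previous block, stacking the residuals and vector-quantizing the combined vector. The sketch below is a toy illustration of that flow; the codebook, prediction gain and dimensions are assumptions, not taken from the reference.

```python
# Sketch of the GB2324689A claim 1 joint quantization: predicted magnitudes
# from the previous block, combined residuals, and a toy vector quantizer.
import numpy as np

def joint_quantize(mags_sub1, mags_sub2, prev_quantized, codebook, pred_gain=0.8):
    predicted = pred_gain * prev_quantized                  # prediction from the previous block
    residual = np.concatenate([mags_sub1 - predicted,
                               mags_sub2 - predicted])      # combined residual vector
    dists = np.sum((codebook - residual) ** 2, axis=1)      # nearest-neighbour codebook search
    idx = int(np.argmin(dists))
    r1, r2 = np.split(codebook[idx], 2)                     # separate residuals per subframe
    return idx, predicted + r1, predicted + r2              # reconstructed magnitudes

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    cb = rng.standard_normal((64, 16))                      # toy 64-entry codebook
    prev = np.ones(8)
    m1, m2 = prev + 0.1, prev - 0.1
    idx, q1, q2 = joint_quantize(m1, m2, prev, cb)
    print(idx, np.round(q1[:3], 2))
```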

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (Discrete Cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .
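
Claim 19 above handles two special transitions: when the erasure spans a voiced-to-unvoiced transition, or a comfort-noise-to-active-speech transition, the scaling gain at the start of the first good frame is simply set equal to the gain used at its end instead of being interpolated. A minimal sketch of that decision follows; the class names and flags are illustrative assumptions.

```python
# Sketch of the claim 19 gain equalization at voiced-to-unvoiced and
# comfort-noise-to-active-speech transitions (labels are illustrative).

VOICED_LIKE = {"voiced", "voiced transition", "onset"}

def choose_start_gain(g_begin, g_end, last_good_class, first_good_class,
                      last_good_is_cng=False, first_good_is_active=False):
    voiced_to_unvoiced = (last_good_class in VOICED_LIKE
                          and first_good_class == "unvoiced")
    cng_to_active = last_good_is_cng and first_good_is_active
    if voiced_to_unvoiced or cng_to_active:
        return g_end                       # start gain made equal to the end gain
    return g_begin                         # otherwise keep the matched start gain

if __name__ == "__main__":
    print(choose_start_gain(1.4, 0.6, "voiced", "unvoiced"))     # -> 0.6
    print(choose_start_gain(1.4, 0.6, "unvoiced", "unvoiced"))   # -> 1.4
```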

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoded bits) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (Discrete Cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
GB2324689A
CLAIM 1
. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel , the method comprising the steps of : digitizing a speech signal (decoder determines concealment, speech signal) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes comprising a plurality of the digital speech samples ;
estimating a set of model parameters for each of the subframes ;
wherein the model parameters comprise a set of spectral magnitude parameters that represent spectral information for the subframe ;
combining two consecutive subframes from the sequence of subframes into a block ;
jointly quantizing the spectral magnitude parameters from both of the subframes within the block , wherein the joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from a previous block , computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters , combining the residual parameters from both of the subframes within the block , and using a plurality of vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits ;
adding redundant error control bits to the encoded spectral bits from each block to protect at least some of the encoded spectral bits within the block from bit errors ;
and combining the added redundant error control bits and encoded spectral bits from two consecutive blocks into a 90 millisecond frame of bits for transmission across a satellite communication channel .

GB2324689A
CLAIM 8
. The method of claim 2 wherein the transformed residual coefficients are computed for each of the frequency blocks using a Discrete Cosine (decoder determines concealment, speech signal) Transform (DCT) followed by a linear 2 by 2 transform on the two lowest order DCT coefficients .
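
GB2324689A claim 8, charted above, applies a DCT to the residual coefficients of each frequency block and then a linear 2 by 2 transform to the two lowest-order DCT coefficients. The sketch below illustrates that step; the block contents and the particular 2 by 2 matrix are illustrative assumptions, not taken from the reference.

```python
# Sketch of the GB2324689A claim 8 step: DCT of one frequency block of residual
# parameters, followed by a linear 2x2 transform on the two lowest coefficients.
import numpy as np

def dct_ii(x):
    n = len(x)
    k = np.arange(n)
    return np.array([np.sum(x * np.cos(np.pi * (k + 0.5) * m / n)) for m in range(n)])

def transform_block(residuals, t2x2=np.array([[1.0, 1.0], [1.0, -1.0]])):
    c = dct_ii(residuals)
    c[:2] = t2x2 @ c[:2]          # linear 2x2 transform on the two lowest-order coefficients
    return c

if __name__ == "__main__":
    block = np.array([0.9, 0.4, -0.2, 0.1, 0.0, -0.3])   # one toy frequency block
    print(np.round(transform_block(block), 3))
```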

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoded bits) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
GB2324689A
CLAIM 14
. A method of decoding speech from a 90 millisecond frame of bits received across a satellite communication channel , the method comprising the steps of : dividing the frame of bits into two blocks of bits , wherein each block of bits represents two subframes of speech ;
applying error control decoding to each block of bits using redundant error control bits included within the block to produce error decoded bits (decoder recovery) which are at least in part protected from bit errors ;
using the error decoded bits to jointly reconstruct spectral magnitude parameters for both of the subframes within a block , wherein the joint reconstruction includes using a plurality of vector quantizer codebooks to reconstruct a set of combined residual parameters from which separate residual parameters for both of the subframes are computed , forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters from a previous block , and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters for each subframe within the block ;
and synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters for the subframe .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6167375A

Filed: 1998-03-16     Issued: 2000-12-26

Method for encoding and decoding a speech signal including background noise

(Original Assignee) Toshiba Corp     (Current Assignee) Toshiba Corp

Kimio Miseki, Masahiro Oshikiri, Tadashi Amada, Masami Akamine
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .
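
US6167375A claim 22, charted above, decodes a bit-allocation field first, then decodes the speech-dominant and noise-dominant components according to that allocation, and finally adds the two reproduced components. The sketch below is a toy illustration of that flow; the bitstream layout and the component "decoders" are assumptions, not taken from the reference.

```python
# Toy sketch of the US6167375A claim 22 decoding flow: read the bit allocation,
# decode the two components accordingly, and sum them into the output signal.
import numpy as np

def decode_frame(bits, frame_len=160):
    alloc = int(bits[0])                                   # bits assigned to the first component
    first_bits, second_bits = bits[1:1 + alloc], bits[1 + alloc:]
    # Toy component decoders: amplitude proportional to the number of bits received.
    first = np.full(frame_len, len(first_bits) / 10.0)     # speech-dominant component
    second = np.full(frame_len, len(second_bits) / 100.0)  # background-noise component
    return first + second                                  # final output speech signal

if __name__ == "__main__":
    frame_bits = [6] + [1, 0, 1, 1, 0, 1] + [0, 1, 0]
    print(decode_frame(frame_bits)[:3])
```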

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation (first decode) : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6167375A
CLAIM 24
. A speech decoding apparatus comprising : a component separator configured to separate from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
a first decode (⁢ E) r configured to decode said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
a second decoder configured to decode the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and a mixer configured to add the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation (first decode) : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US6167375A
CLAIM 24
. A speech decoding apparatus comprising : a component separator configured to separate from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
a first decode (⁢ E) r configured to decode said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
a second decoder configured to decode the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and a mixer configured to add the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .
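
To make the claim 15 "searcher" and "quantizer" limitations charted above more concrete, here is a minimal Python sketch of locating the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizing its position. The synthetic residual, pitch value and uniform quantization step are assumptions for illustration only, not details drawn from the patent or the reference.

import numpy as np

def find_and_quantize_first_glottal_pulse(residual, pitch_period, quant_step=2):
    # sample of maximum amplitude within one pitch period, taken as the first glottal pulse
    segment = np.asarray(residual[:pitch_period], dtype=float)
    position = int(np.argmax(np.abs(segment)))
    sign = 1 if segment[position] >= 0 else -1
    amplitude = float(abs(segment[position]))
    # quantize the pulse position (uniform step, an assumption)
    quantized_position = (position // quant_step) * quant_step
    return quantized_position, sign, amplitude

rng = np.random.default_rng(0)
residual = rng.standard_normal(160)       # synthetic stand-in for an LP residual
print(find_and_quantize_first_glottal_pulse(residual, pitch_period=80))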

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .
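
As a reading aid for the claim 16 "computer of the energy information parameter" charted above, the sketch below computes the energy information one way for frames classified as voiced or onset (in relation to a maximum of the signal energy) and another way for other frames (average energy per sample). The sliding pitch-length window and the class labels are simplifying assumptions, not specifics of the patent or the reference.

import numpy as np

def energy_information(frame, frame_class, pitch_period=40):
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        # relate the parameter to a maximum of the signal energy, computed here
        # over sliding pitch-length windows (window choice is an assumption)
        energies = [np.sum(frame[i:i + pitch_period] ** 2)
                    for i in range(0, len(frame) - pitch_period + 1)]
        return float(max(energies))
    # other classes: average energy per sample
    return float(np.mean(frame ** 2))

rng = np.random.default_rng(1)
frame = rng.standard_normal(256)
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))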

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .
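
For the claim 17 energy-control limitations charted above, the following Python sketch shows one way to scale a synthesized frame so that its beginning matches the energy at the end of the last concealed frame and its end converges toward an energy derived from a received energy parameter, while capping the gain to limit any increase in energy. The linear gain interpolation, the measurement window and the gain cap are assumptions for illustration, not the patent's stated implementation.

import numpy as np

def scale_recovery_frame(synth, energy_end_concealed, energy_target, max_gain=1.98):
    synth = np.asarray(synth, dtype=float)
    win = 40                                   # measurement window, an assumption
    e_start = np.mean(synth[:win] ** 2) + 1e-12
    e_end = np.mean(synth[-win:] ** 2) + 1e-12
    # g0 matches the energy at the end of the last concealed frame,
    # g1 converges toward the energy implied by the received energy parameter;
    # both gains are capped to limit any increase in energy
    g0 = min(np.sqrt(energy_end_concealed / e_start), max_gain)
    g1 = min(np.sqrt(energy_target / e_end), max_gain)
    gains = np.linspace(g0, g1, len(synth))    # sample-by-sample interpolation
    return synth * gains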

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 (first decode) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6167375A
CLAIM 24
. A speech decoding apparatus comprising : a component separator configured to separate from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
a first decoder (E) configured to decode said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
a second decoder configured to decode the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and a mixer configured to add the reproduced first and second components to generate a final output speech signal .
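
Claim 21, as charted above, ties the excitation-energy adjustment to the recited relation E_q = E_1 · E_LP0 / E_LP1. The sketch below evaluates that relation from LP filter coefficients by measuring the energy of each filter's truncated impulse response; the example filter coefficients and the truncation length are hypothetical choices for illustration only.

import numpy as np

def lp_impulse_response_energy(a_coeffs, n=64):
    # energy of the truncated impulse response of the all-pole filter 1/A(z),
    # with A(z) = 1 + a1*z^-1 + ... ; truncation length n is an assumption
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a_coeffs)):
            if i - k >= 0:
                acc -= a_coeffs[k] * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    # E_q = E_1 * E_LP0 / E_LP1, per the relation recited in claim 21
    e_lp0 = lp_impulse_response_energy(a_last_good)   # LP filter of last good frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # LP filter of first good frame after erasure
    return e1 * e_lp0 / e_lp1

# hypothetical second-order LP filters for illustration
print(adjusted_excitation_energy(1.0, [1.0, -0.9, 0.2], [1.0, -1.2, 0.5]))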

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 (first decode) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6167375A
CLAIM 22
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for separating from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
means for decoding said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
means for decoding the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and means for adding the reproduced first and second components to generate a final output speech signal .

US6167375A
CLAIM 24
. A speech decoding apparatus comprising : a component separator configured to separate from transmitted input data information on bit allocation regarding each of first and second encoded data of first and second components , the first encoded data of the first component , and the second encoded data of the second component , wherein the first component is mainly constituted by a speech signal and the second component is mainly constituted by a background noise signal which varies in spectrum more slowly than that of the speech signal ;
a first decoder (E) configured to decode said information on bit allocation to obtain bit allocation regarding the first and second encoded data of said first and second components ;
a second decoder configured to decode the first and second encoded data of said first and second components in accordance with the bit allocation to reproduce said first and second components and to obtain reproduced first and second components ;
and a mixer configured to add the reproduced first and second components to generate a final output speech signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6064954A

Filed: 1998-03-04     Issued: 2000-05-16

Digital audio signal coding

(Original Assignee) International Business Machines Corp     (Current Assignee) Cisco Technology Inc

Gilad Cohen, Yossef Cohen, Doron Hoffman, Hagai Krupnik, Aharon Satt
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6064954A
CLAIM 11
. Apparatus as claimed in claim 1 wherein the input signal comprises a set of signal samples (impulse responses) arranged in frames and wherein the apparatus is arranged to enable or disable the subtraction of the prediction signal from the input signal according to an estimation of the likely coding gain to be derived therefrom and wherein the output signal includes an indication for each frame as to whether the prediction signal has been subtracted from the input signal .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6064954A
CLAIM 3
. Apparatus as claimed in claim 2 wherein the quantizer comprises means for calculating a masking threshold sequence that represents an (LP filter excitation signal) amplitude bound for quantization noise in the frequency domain and means to divide frequency domain coefficients of the error signal by the masking threshold sequence to obtain normalized coefficients , and wherein the output signal includes information defining the masking threshold sequence .

US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .
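
The claim 8 limitation charted above conditions the energy adjustment on the LP filter gain of the first non-erased frame exceeding that of the last erased frame. Below is a minimal sketch of that trigger, assuming the LP filter "gain" is measured from its impulse-response energy (an assumption consistent with the relation recited in claim 9, not a definition given by the patent or the reference).

import numpy as np

def lp_gain(a_coeffs, n=64):
    # assumed proxy for the LP filter "gain": square root of the energy of the
    # truncated impulse response of 1/A(z)
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a_coeffs)):
            if i - k >= 0:
                acc -= a_coeffs[k] * h[i - k]
        h[i] = acc
    return float(np.sqrt(np.sum(h ** 2)))

def should_adjust_excitation_energy(a_first_good, a_last_erased):
    # trigger the adjustment only when the first non-erased frame's LP gain
    # exceeds the LP gain of the last erased frame
    return lp_gain(a_first_good) > lp_gain(a_last_erased)

print(should_adjust_excitation_energy([1.0, -1.2, 0.5], [1.0, -0.9, 0.2]))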

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6064954A
CLAIM 3
. Apparatus as claimed in claim 2 wherein the quantizer comprises means for calculating a masking threshold sequence that represents an (LP filter excitation signal) amplitude bound for quantization noise in the frequency domain and means to divide frequency domain coefficients of the error signal by the masking threshold sequence to obtain normalized coefficients , and wherein the output signal includes information defining the masking threshold sequence .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6064954A
CLAIM 3
. Apparatus as claimed in claim 2 wherein the quantizer comprises means for calculating a masking threshold sequence that represents an (LP filter excitation signal) amplitude bound for quantization noise in the frequency domain and means to divide frequency domain coefficients of the error signal by the masking threshold sequence to obtain normalized coefficients , and wherein the output signal includes information defining the masking threshold sequence .

US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6064954A
CLAIM 11
. Apparatus as claimed in claim 1 wherein the input signal comprises a set of signal samples (impulse responses) arranged in frames and wherein the apparatus is arranged to enable or disable the subtraction of the prediction signal from the input signal according to an estimation of the likely coding gain to be derived therefrom and wherein the output signal includes an indication for each frame as to whether the prediction signal has been subtracted from the input signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6064954A
CLAIM 3
. Apparatus as claimed in claim 2 wherein the quantizer comprises means for calculating a masking threshold sequence that represents an (LP filter excitation signal) amplitude bound for quantization noise in the frequency domain and means to divide frequency domain coefficients of the error signal by the masking threshold sequence to obtain normalized coefficients , and wherein the output signal includes information defining the masking threshold sequence .

US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6064954A
CLAIM 3
. Apparatus as claimed in claim 2 wherein the quantizer comprises means for calculating a masking threshold sequence that represents an (LP filter excitation signal) amplitude bound for quantization noise in the frequency domain and means to divide frequency domain coefficients of the error signal by the masking threshold sequence to obtain normalized coefficients , and wherein the output signal includes information defining the masking threshold sequence .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6064954A
CLAIM 3
. Apparatus as claimed in claim 2 wherein the quantizer comprises means for calculating a masking threshold sequence that represents an (LP filter excitation signal) amplitude bound for quantization noise in the frequency domain and means to divide frequency domain coefficients of the error signal by the masking threshold sequence to obtain normalized coefficients , and wherein the output signal includes information defining the masking threshold sequence .

US6064954A
CLAIM 16
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 14 and stored on a physical medium .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6134518A

Filed: 1998-03-04     Issued: 2000-10-17

Digital audio signal coding using a CELP coder and a transform coder

(Original Assignee) International Business Machines Corp     (Current Assignee) Cisco Technology Inc

Gilad Cohen, Yossef Cohen, Doron Hoffman, Hagai Krupnik, Aharon Satt
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6134518A
CLAIM 1
. Apparatus for digitally encoding an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples (impulse responses) ordered in time and divided into frames , comprising : logic for measuring a distinguishing parameter from the input signal , determining means for determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
first and second coders for digitally encoding the input signal using first and second coding methods respectively ;
a switching arrangement for , at any particular time , directing the generation of an output signal by encoding the input signal using either the first or second coders according to whether the input signal contains an audio signal of the first type or the second type at that time ;
and wherein the first coder is a Codebook Excited Linear Predictive (CELP) coder and the second coder is a transform coder , each coder being arranged to operate on a frame-by-frame basis , the transform coder being arranged to encode a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coder is arranged to encode an extended frame to generate the last CELP encoded data prior to a switch from a mode of operation in which frames are encoded using the transform coder , the extended frame covers the same range of sample as the transform coder , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coder from the last CELP encoded frame .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .
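
For the phase-information element of challenged claims 2, 10, 14 and 22 (encoding the shape, sign and amplitude of the first glottal pulse), the sketch below shows one plausible encoder-side packaging. The shape codebook, amplitude table and all names are hypothetical stand-ins; the claims themselves do not fix these details.

import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_codebook, amp_levels):
    # residual: LP residual of the frame; pulse_pos: position of the first glottal pulse
    # shape_codebook: (N, L) array of unit-energy pulse prototypes (hypothetical)
    # amp_levels: 1-D array of quantized amplitude values (hypothetical)
    L = shape_codebook.shape[1]
    seg = residual[pulse_pos:pulse_pos + L]
    if len(seg) < L:                                    # pad if the pulse is near the frame end
        seg = np.pad(seg, (0, L - len(seg)))
    sign = 1 if seg[int(np.argmax(np.abs(seg)))] >= 0 else -1
    amp_idx = int(np.argmin(np.abs(amp_levels - np.max(np.abs(seg)))))
    shape_idx = int(np.argmax(shape_codebook @ (sign * seg)))   # best-correlating prototype
    return shape_idx, sign, amp_idx                     # transmitted alongside the pulse position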

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .
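
Challenged claims 3, 11, 15 and 23 narrow the phase information to taking the maximum-amplitude sample inside a pitch period as the first glottal pulse and quantizing its position. A short sketch of that search with a uniform position quantizer follows; the bit budget and the use of the LP residual as the search signal are assumptions, not claim requirements.

import numpy as np

def first_glottal_pulse_position(residual, pitch, bits=6):
    # residual: LP residual (assumed search signal); pitch: pitch period in samples
    period = residual[:pitch]
    pos = int(np.argmax(np.abs(period)))       # sample of maximum amplitude in the pitch period
    step = max(1, pitch // (1 << bits))        # uniform quantization step (assumed precision)
    index = pos // step                        # index sent to the decoder
    return index, index * step                 # index and reconstructed (quantized) position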

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (audio data) per sample for other frames .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data (average energy) , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .
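
Challenged claims 4, 16 and 24 compute the energy information parameter from a maximum of the signal energy for frames classified as voiced or onset, and from the average energy per sample for other frames. The sketch below mirrors that split; taking the maximum over the last pitch period and expressing the result in dB are assumptions made for illustration.

import numpy as np

def energy_information(frame, frame_class, pitch):
    # frame: speech frame samples; frame_class: classifier output; pitch: pitch period in samples
    if frame_class in ("voiced", "onset"):
        E = np.max(frame[-pitch:] ** 2)        # maximum of the signal energy (assumed window)
    else:
        E = np.mean(frame ** 2)                # average energy per sample
    return 10.0 * np.log10(E + 1e-12)          # assumed dB-domain value fed to the quantizer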

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .
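
Challenged claims 5 and 17 recite two energy-control steps in the first good frame after an erasure: scale it so its initial energy matches the energy at the end of the last concealed frame, then converge toward the energy signalled by the energy information parameter while limiting any increase. A compact sketch is given below; the 32-sample measurement windows, the linear gain interpolation and the gain cap are assumptions.

import numpy as np

def control_recovery_energy(synth, E_concealed_end, E_signalled, max_gain=2.0):
    # synth: synthesized signal of the first non-erased frame after the erasure
    # E_concealed_end: energy at the end of the last concealed (erased) frame
    # E_signalled: energy corresponding to the received energy information parameter
    E0 = np.mean(synth[:32] ** 2) + 1e-12              # energy at the frame beginning
    E1 = np.mean(synth[-32:] ** 2) + 1e-12             # energy at the frame end
    g0 = min(np.sqrt(E_concealed_end / E0), max_gain)  # match the concealed-frame energy
    g1 = min(np.sqrt(E_signalled / E1), max_gain)      # converge to the signalled energy, capped
    return synth * np.linspace(g0, g1, len(synth))     # sample-wise gain interpolation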

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (method steps) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US6134518A
CLAIM 20
. A program storage device readable by machine , tangibly embodying a program of instructions executable by the machine to perform method steps for causing a digitally encoding of an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples ordered in time and divided into frames , said method steps (LP filter) comprising : measuring a distinguishing parameter from the input signal , determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
and generating an output signal by encoding the input signal using either first or second coding methods according to whether the input signal contains an audio signal of the first type or the second type at that time , wherein the first coding method is CELP coding and the second coding method is transform coding , and wherein the input signal is coded on a frame-by-frame basis , the transform coding comprising encoding a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coding comprises generating the last CELP encoded frame prior to a switch from a mode of operation in which frames are encoded using the CELP coding to a mode of operation in which frames are encoded using transform coding by encoding an extended frame , the extended frame covering the same range of samples as the transform coding , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coding from the last CELP encoded frame .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (method steps) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 (first decoder) / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder (E) is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 20
. A program storage device readable by machine , tangibly embodying a program of instructions executable by the machine to perform method steps for causing a digitally encoding of an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples ordered in time and divided into frames , said method steps (LP filter) comprising : measuring a distinguishing parameter from the input signal , determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
and generating an output signal by encoding the input signal using either first or second coding methods according to whether the input signal contains an audio signal of the first type or the second type at that time , wherein the first coding method is CELP coding and the second coding method is transform coding , and wherein the input signal is coded on a frame-by-frame basis , the transform coding comprising encoding a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coding comprises generating the last CELP encoded frame prior to a switch from a mode of operation in which frames are encoded using the CELP coding to a mode of operation in which frames are encoded using transform coding by encoding an extended frame , the extended frame covering the same range of samples as the transform coding , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coding from the last CELP encoded frame .
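
Challenged claims 8 and 20 condition the excitation-energy adjustment on the LP filter gain of the first good frame exceeding that of the last erased frame, and claims 9, 12, 21 and 25 express the adjustment through the relation E_q = E_1 · E_LP0 / E_LP1, with E_LP0 and E_LP1 the impulse-response energies of the LP synthesis filters on either side of the erasure. The sketch below evaluates that relation from LP coefficient sets; the 64-sample truncation of the impulse response and the function names are assumptions.

import numpy as np

def lp_impulse_response_energy(a, n=64):
    # a = [1, a1, ..., ap]: coefficients of A(z); energy of the impulse response of 1/A(z)
    h = np.zeros(n)
    for i in range(n):                          # truncated impulse response (assumed length)
        x = 1.0 if i == 0 else 0.0
        h[i] = x - sum(a[k] * h[i - k] for k in range(1, len(a)) if i - k >= 0)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(E1, a_last_good, a_first_good):
    # E1: energy at the end of the current frame
    E_LP0 = lp_impulse_response_energy(a_last_good)    # LP filter of the last good frame before erasure
    E_LP1 = lp_impulse_response_energy(a_first_good)   # LP filter of the first good frame after erasure
    return E1 * E_LP0 / E_LP1                          # E_q as recited in the claims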

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (method steps) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 (first decoder) / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder (E) is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US6134518A
CLAIM 20
. A program storage device readable by machine , tangibly embodying a program of instructions executable by the machine to perform method steps for causing a digitally encoding of an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples ordered in time and divided into frames , said method steps (LP filter) comprising : measuring a distinguishing parameter from the input signal , determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
and generating an output signal by encoding the input signal using either first or second coding methods according to whether the input signal contains an audio signal of the first type or the second type at that time , wherein the first coding method is CELP coding and the second coding method is transform coding , and wherein the input signal is coded on a frame-by-frame basis , the transform coding comprising encoding a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coding comprises generating the last CELP encoded frame prior to a switch from a mode of operation in which frames are encoded using the CELP coding to a mode of operation in which frames are encoded using transform coding by encoding an extended frame , the extended frame covering the same range of samples as the transform coding , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coding from the last CELP encoded frame .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6134518A
CLAIM 1
. Apparatus for digitally encoding an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples (impulse responses) ordered in time and divided into frames , comprising : logic for measuring a distinguishing parameter from the input signal , determining means for determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
first and second coders for digitally encoding the input signal using first and second coding methods respectively ;
a switching arrangement for , at any particular time , directing the generation of an output signal by encoding the input signal using either the first or second coders according to whether the input signal contains an audio signal of the first type or the second type at that time ;
and wherein the first coder is a Codebook Excited Linear Predictive (CELP) coder and the second coder is a transform coder , each coder being arranged to operate on a frame-by-frame basis , the transform coder being arranged to encode a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coder is arranged to encode an extended frame to generate the last CELP encoded data prior to a switch from a mode of operation in which frames are encoded using the CELP coder to a mode of operation in which frames are encoded using the transform coder , the extended frame covers the same range of samples as the transform coder , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coder from the last CELP encoded frame .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data (average energy) , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (method steps) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US6134518A
CLAIM 20
. A program storage device readable by machine , tangibly embodying a program of instructions executable by the machine to perform method steps for causing a digitally encoding of an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples ordered in time and divided into frames , said method steps (LP filter) comprising : measuring a distinguishing parameter from the input signal , determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
and generating an output signal by encoding the input signal using either first or second coding methods according to whether the input signal contains an audio signal of the first type or the second type at that time , wherein the first coding method is CELP coding and the second coding method is transform coding , and wherein the input signal is coded on a frame-by-frame basis , the transform coding comprising encoding a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coding comprises generating the last CELP encoded frame prior to a switch from a mode of operation in which frames are encoded using the CELP coding to a mode of operation in which frames are encoded using transform coding by encoding an extended frame , the extended frame covering the same range of samples as the transform coding , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coding from the last CELP encoded frame .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (method steps) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 (first decoder) / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder (E) is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 20
. A program storage device readable by machine , tangibly embodying a program of instructions executable by the machine to perform method steps for causing a digitally encoding of an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples ordered in time and divided into frames , said method steps (LP filter) comprising : measuring a distinguishing parameter from the input signal , determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
and generating an output signal by encoding the input signal using either first or second coding methods according to whether the input signal contains an audio signal of the first type or the second type at that time , wherein the first coding method is CELP coding and the second coding method is transform coding , and wherein the input signal is coded on a frame-by-frame basis , the transform coding comprising encoding a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coding comprises generating the last CELP encoded frame prior to a switch from a mode of operation in which frames are encoded using the CELP coding to a mode of operation in which frames are encoded using transform coding by encoding an extended frame , the extended frame covering the same range of samples as the transform coding , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coding from the last CELP encoded frame .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data (average energy) , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (method steps) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 (first decoder) / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6134518A
CLAIM 6
. Apparatus for digitally decoding an input signal comprising coded data for a series of frames of audio data , comprising : logic to detect an indication in the coded data stream for each frame as to whether the frame has been encoded using a first coder or a second coder ;
first and second decoders for digitally decoding the input signal using first and second decoding methods respectively ;
a switching arrangement , for each frame , directing the generation of an output signal by decoding the input signal using either the first or second decoders according to the detected indication ;
and wherein the first decoder (E) is a CELP decoder and the second decoder is a transform decoder and when switching from the mode of operation of decoding CELP encoded frames to transform encoded frames , the transform coder uses the information in an extended CELP frame when decoding the first frame encoded using the transform coder .

US6134518A
CLAIM 12
. A coded representation (energy information parameter, phase information parameter) of an audio signal produced using a method as claimed in claim 7 , and stored on a physical support .

US6134518A
CLAIM 20
. A program storage device readable by machine , tangibly embodying a program of instructions executable by the machine to perform method steps for causing a digitally encoding of an input audio signal for storage or transmission wherein the input audio signal comprises a series of signal samples ordered in time and divided into frames , said method steps (LP filter) comprising : measuring a distinguishing parameter from the input signal , determining from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type ;
and generating an output signal by encoding the input signal using either first or second coding methods according to whether the input signal contains an audio signal of the first type or the second type at that time , wherein the first coding method is CELP coding and the second coding method is transform coding , and wherein the input signal is coded on a frame-by-frame basis , the transform coding comprising encoding a frame using a discrete frequency domain transform of a range of samples from a plurality of neighboring frames , and wherein the CELP coding comprises generating the last CELP encoded frame prior to a switch from a mode of operation in which frames are encoded using the CELP coding to a mode of operation in which frames are encoded using transform coding by encoding an extended frame , the extended frame covering the same range of samples as the transform coding , so that a transform decoder can generate the information required to decode the first frame encoded using the transform coding from the last CELP encoded frame .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6263312B1

Filed: 1998-03-02     Issued: 2001-07-17

Audio compression and decompression employing subband decomposition of residual signal and distortion reduction

(Original Assignee) Alaris Inc; G T Tech Inc     (Current Assignee) XVD TECHNOLOGY HOLDINGS Ltd (IRELAND)

Victor D. Kolesnik, Irina E. Bocharova, Boris D. Kudryashov, Eugene Ovsyannikov, Andrei N. Trofimov, Boris Troyanovsky
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .
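
The charted US6263312B1 claims build a residual signal as the difference between the input frame and its synthesized reconstruction and then wavelet-decompose it into subbands before encoding. The toy sketch below uses a one-level Haar split purely as a stand-in for the wavelet decomposition; the reference does not prescribe this particular filter and the names are illustrative.

import numpy as np

def residual_subbands(frame, synthesized):
    # frame: input audio frame; synthesized: signal reconstructed from the encoded frame
    residual = frame - synthesized                    # residual signal of the charted claims
    n = len(residual) - (len(residual) % 2)           # use an even number of samples
    even, odd = residual[0:n:2], residual[1:n:2]
    low = (even + odd) / np.sqrt(2.0)                 # approximation (low-band) coefficients
    high = (even - odd) / np.sqrt(2.0)                # detail (high-band) coefficients
    return residual, [low, high]                      # subbands to be individually encoded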

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (audio data) per sample for other frames .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data (average energy) , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
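[Illustrative sketch, not the patent's implementation] The two-stage energy control recited in claim 5 can be pictured as: match the energy at the start of the first good frame to the energy at the end of the concealed frame, then converge toward the energy implied by the received energy information while capping any increase. A minimal Python sketch, where the 32-sample start window, the linear gain ramp, and the gain cap are assumptions:

import numpy as np

def rescale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    # g0 aligns the start-of-frame energy with the end of the last erased frame;
    # g1 converges toward the target energy; max_gain limits the energy increase.
    synth = np.asarray(synth, dtype=float)
    e_start = np.dot(synth[:32], synth[:32]) + 1e-12
    e_frame = np.dot(synth, synth) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)
    g1 = min(np.sqrt(e_target / e_frame), max_gain)
    gains = np.linspace(g0, g1, len(synth))      # gain interpolated across the frame (assumed linear)
    return synth * gains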
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 · ( E LP0 / E LP1 ) (first decode) , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
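[Illustrative sketch, read as a plain ratio] The relation in claim 9 can be exercised numerically: compute the impulse-response energies of the LP synthesis filters of the last good frame before erasure and the first good frame after erasure, then scale the end-of-frame energy by their ratio. The 64-sample impulse-response length and the use of scipy.signal.lfilter are assumptions:

import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(a_coeffs, n=64):
    # Energy of the impulse response of an all-pole LP synthesis filter 1/A(z).
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.dot(h, h))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    # E q = E 1 * E LP0 / E LP1, with E LP0 and E LP1 the impulse-response
    # energies of the LP filters before and after the erasure (ratio reading).
    e_lp0 = lp_impulse_energy(a_last_good)
    e_lp1 = lp_impulse_energy(a_first_good)
    return e1 * e_lp0 / e_lp1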
US6263312B1
CLAIM 48
. A computer-implemented method of decompressing an audio signal that was compressed , said method comprising : decompressing a first transform encoded frame to generate a first synthesized signal frame ;
decompressing residual signal data associated with the first frame to generate a first set of residual signal subbands , the residual signal data representing the difference between the first frame of the original audio signal and the first transform encoded frame ;
wavelet reconstructing the first set of residual signal subbands using wavelets to generate a first synthesized residual signal frame ;
and adding the first synthesized signal frame and the first synthesized residual signal frame to generate a first decoded (E) audio signal frame .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · ( E LP0 / E LP1 ) (first decode) , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US6263312B1
CLAIM 48
. A computer-implemented method of decompressing an audio signal that was compressed , said method comprising : decompressing a first transform encoded frame to generate a first synthesized signal frame ;
decompressing residual signal data associated with the first frame to generate a first set of residual signal subbands , the residual signal data representing the difference between the first frame of the original audio signal and the first transform encoded frame ;
wavelet reconstructing the first set of residual signal subbands using wavelets to generate a first synthesized residual signal frame ;
and adding the first synthesized signal frame and the first synthesized residual signal frame to generate a first decoded (E) audio signal frame .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
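[Illustrative sketch, not the patent's implementation] Claim 13 describes building the periodic excitation for a lost onset as a low-pass filtered train of pulses: one low-pass impulse response centered at the quantized first-glottal-pulse position, then repeated every average pitch period up to the end of the affected region. A small Python sketch, where the FIR design via scipy.signal.firwin, the filter length, and the cutoff are assumptions:

import numpy as np
from scipy.signal import firwin

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, numtaps=31, cutoff=0.25):
    # Low-pass impulse response (hypothetical design choice).
    h = firwin(numtaps, cutoff)
    half = numtaps // 2
    exc = np.zeros(frame_len)
    pos = first_pulse_pos
    while pos < frame_len:
        # Center the impulse response on each pulse position, clipped to the frame,
        # with successive pulses spaced by the average pitch value.
        start, stop = max(0, pos - half), min(frame_len, pos + half + 1)
        exc[start:stop] += h[(start - pos) + half:(stop - pos) + half]
        pos += avg_pitch
    return exc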
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
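[Illustrative sketch, not the patent's implementation] The searcher/quantizer arrangement in claim 15 can be pictured as taking the sample of maximum amplitude within the first pitch period of the LP residual as the first glottal pulse and quantizing its position onto a coarse grid. A minimal Python sketch; the use of the absolute maximum and the 4-sample quantization step are assumptions:

import numpy as np

def search_and_quantize_glottal_pulse(residual, pitch_period, step=4):
    # Sample of maximum (absolute) amplitude within one pitch period.
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))
    sign = 1 if segment[pos] >= 0 else -1
    amplitude = abs(segment[pos])
    q_pos = (pos // step) * step      # quantized position on a coarse grid (assumed step)
    return q_pos, sign, amplitude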
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data (average energy) , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) , limits to a given value a gain used for scaling the synthesized sound signal .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 · ( E LP0 / E LP1 ) (first decode) , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6263312B1
CLAIM 48
. A computer-implemented method of decompressing an audio signal that was compressed , said method comprising : decompressing a first transform encoded frame to generate a first synthesized signal frame ;
decompressing residual signal data associated with the first frame to generate a first set of residual signal subbands , the residual signal data representing the difference between the first frame of the original audio signal and the first transform encoded frame ;
wavelet reconstructing the first set of residual signal subbands using wavelets to generate a first synthesized residual signal frame ;
and adding the first synthesized signal frame and the first synthesized residual signal frame to generate a first decoded (E) audio signal frame .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (audio data) per sample for other frames .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data (average energy) , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (second frames) and decoder recovery (second residual signal, first residual signal) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · ( E LP0 / E LP1 ) (first decode) , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6263312B1
CLAIM 1
. A computer-implemented method for compressing audio data , comprising : encoding a first frame of an input audio signal to generate a first encoded signal ;
generating a first synthesized signal from the first encoded signal ;
generating a first residual signal (decoder concealment, decoder recovery) representing a difference between the first frame of the input audio signal and the first synthesized signal ;
wavelet decomposing the first residual signal into a first set of residual signal subbands ;
and encoding at least certain subbands in the first set of residual signal subbands .

US6263312B1
CLAIM 6
. The method of claim 5 , further comprising : determining that the first synthesized signal is sufficiently similar to the first frame of the input audio signal prior to said step of encoding at least certain subbands in the first set of residual signal subbands ;
and determining that the second synthesized signal is sufficiently dissimilar to the second frame of the input audio signal prior to said encoding at least certain subbands in the second set of residual signal subbands ;
and determining to encode the first and second frames (frame concealment) of the input audio signal differently based on said determining that the first synthesized signal is sufficiently similar and said determining that the second synthesized signal is sufficiently dissimilar .

US6263312B1
CLAIM 30
. The method of claim 28 , further comprising : encoding a second frame of an input audio signal to generate a second encoded signal ;
generating a second synthesized signal from the second encoded signal ;
determining that the second synthesized signal is sufficiently similar to the second frame of the input audio signal ;
generating a second residual signal (decoder concealment, decoder recovery) representing a difference between the second frame of the input audio signal and the second synthesized signal ;
decomposing the second residual signal into a second set of residual signal subbands ;
and encoding at least certain of the second set of residual signal subbands .

US6263312B1
CLAIM 48
. A computer-implemented method of decompressing an audio signal that was compressed , said method comprising : decompressing a first transform encoded frame to generate a first synthesized signal frame ;
decompressing residual signal data associated with the first frame to generate a first set of residual signal subbands , the residual signal data representing the difference between the first frame of the original audio signal and the first transform encoded frame ;
wavelet reconstructing the first set of residual signal subbands using wavelets to generate a first synthesized residual signal frame ;
and adding the first synthesized signal frame and the first synthesized residual signal frame to generate a first decoded (E) audio signal frame .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5963897A

Filed: 1998-02-27     Issued: 1999-10-05

Apparatus and method for hybrid excited linear prediction speech encoding

(Original Assignee) Lernout and Hauspie Speech Products NV     (Current Assignee) Nuance Communications Inc

Manel Guberna Alpuente, Jean-Francois Rasaminjanahary, Mohand Ferhaoui, Dirk Van Compernolle
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (fixed number) affected by the artificial construction of the periodic part .
US5963897A
CLAIM 39
. An excitation signal generator as in claim 23 , wherein the excitation candidate generator uses a fixed number (last subframe) of single waveforms .

US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .
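[Illustrative sketch, not the reference's implementation] The reference claim above describes a closed-loop (analysis-by-synthesis) search over excitation candidates, keeping a candidate whose perceptually weighted error indicates sufficiently accurate encoding. A toy Python sketch with the synthesis and perceptual-weighting operations passed in as black boxes; all names and the threshold test are hypothetical:

import numpy as np

def select_excitation(candidates, reference, synthesize, weight, threshold):
    # Closed-loop search: synthesize each candidate, perceptually weight it,
    # and return the first candidate whose weighted error is small enough;
    # otherwise return the best candidate found so the caller may refine the set.
    best, best_err = None, np.inf
    for cand in candidates:
        err_sig = np.asarray(reference, dtype=float) - weight(synthesize(cand))
        err = float(np.dot(err_sig, err_sig))
        if err < best_err:
            best, best_err = cand, err
        if err <= threshold:              # "sufficiently accurate encoding"
            return cand, err
    return best, best_err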

US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (redundancy information) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .

US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
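[Illustrative sketch, not the patent's implementation] Claim 7 lists two transition cases in which the beginning-of-frame scaling gain is simply set equal to the end-of-frame gain. A small Python decision-rule sketch; the class labels follow the chart's own wording and the function and its defaults are hypothetical:

def start_gain(g_begin, g_end, last_good_class, first_good_class,
               last_good_is_comfort_noise, first_good_is_active_speech):
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class == "unvoiced")
    dtx_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    # In both listed transition cases the start-of-frame gain is made equal to
    # the end-of-frame gain; otherwise keep the separately computed value.
    return g_end if (voiced_to_unvoiced or dtx_to_active) else g_begin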
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (redundancy information) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
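The relation recited at the end of claim 12 fixes a target energy E_q for the excitation in the first good frame. A small numeric sketch follows; the E_q line is taken directly from the claim, while the conversion from E_q to an amplitude gain is my own assumption, shown only to make the arithmetic concrete:

```python
def target_excitation_energy(e1, e_lp0, e_lp1):
    """E_q = E_1 * E_LP0 / E_LP1, as recited at the end of claim 12."""
    return e1 * e_lp0 / e_lp1

# hypothetical values, for illustration only
e1, e_lp0, e_lp1 = 0.8, 1.2, 2.0
e_q = target_excitation_energy(e1, e_lp0, e_lp1)   # 0.48
gain = (e_q / e1) ** 0.5                           # assumed use: amplitude scaling of about 0.77
```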
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (redundancy information) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (fixed number) affected by the artificial construction of the periodic part .
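The artificial periodic excitation of claim 13 (and, identically, claim 1 as charted later in this section) amounts to centring one low-pass filter impulse response on the quantized first-glottal-pulse position and placing further impulse responses every average pitch period until the end of the last affected subframe. A minimal sketch under assumed values (the filter taps, span length, pulse position and pitch are illustrative, not taken from the patent):

```python
import numpy as np

def build_periodic_part(length, first_pulse_pos, avg_pitch, lp_ir):
    """Low-pass filtered periodic pulse train: the first impulse response is centred on the
    quantized first-glottal-pulse position, the rest are spaced by the average pitch value
    up to the end of the affected span."""
    exc = np.zeros(length)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < length:
        for i, h in enumerate(lp_ir):
            j = pos - half + i
            if 0 <= j < length:
                exc[j] += h
        pos += avg_pitch
    return exc

# illustrative call: 256-sample span, first pulse at sample 17, 60-sample average pitch,
# and a short symmetric low-pass impulse response (all numbers are made up)
exc = build_periodic_part(256, 17, 60, np.array([0.25, 0.5, 1.0, 0.5, 0.25]))
```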
US5963897A
CLAIM 39
. An excitation signal generator as in claim 23 , wherein the excitation candidate generator uses a fixed number (last subframe) of single waveforms .

US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (redundancy information) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (redundancy information) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
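Claim 15 (like method claims 3 and 11 charted against the Japanese reference below) locates the first glottal pulse as the maximum-amplitude sample within one pitch period and quantizes that position. A minimal sketch; the uniform-step quantizer is an assumption added only to show the shape of the operation:

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Return the max-amplitude sample index inside the first pitch period and a
    quantized version of it (uniform step quantizer is an assumption)."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))      # sample of maximum amplitude
    q_pos = (pos // step) * step               # assumed uniform position quantizer
    return pos, q_pos
```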
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (redundancy information) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
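The energy information parameter of claims 4, 16 and 24 switches its definition with the frame class. A sketch, assuming "maximum of a signal energy" means the maximum per-sample energy within the frame (the claim does not fix the window):

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter: maximum of the signal energy for voiced/onset
    frames, average energy per sample for all other classes."""
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))
```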
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .

US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (redundancy information) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
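Claim 17 (mirroring method claim 5) describes a two-point gain control across the first good frame: match the energy of the concealed signal at the frame start, then converge toward the transmitted energy parameter by the frame end while limiting any increase. A minimal sketch with linear gain interpolation; the window length and gain cap are assumptions:

```python
import numpy as np

def rescale_first_good_frame(synth, e_end_concealed, e_received, win=32, max_gain=2.0):
    """Start gain matches the energy at the end of the concealed segment; end gain
    converges toward the received energy parameter; both are capped so the energy
    increase stays limited; gains are interpolated sample by sample across the frame."""
    synth = np.asarray(synth, dtype=float)
    e_start = float(np.mean(synth[:win] ** 2)) + 1e-12
    e_stop = float(np.mean(synth[-win:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)
    g1 = min(np.sqrt(e_received / e_stop), max_gain)
    gains = np.linspace(g0, g1, len(synth))
    return synth * gains
```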
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (redundancy information) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .

US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
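Claims 7 and 19 suspend the gain interpolation in two situations (a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition) by forcing the start-of-frame gain to equal the end-of-frame gain. A sketch of just that branch, reusing the hypothetical g0/g1 naming from the previous example:

```python
def start_gain(g0, g1, last_class, first_class, last_was_cng, first_is_active):
    """Force the start gain to equal the end gain in the two transition cases recited
    in claims 7 and 19; otherwise keep the interpolated behaviour."""
    voiced_like = last_class in ("voiced transition", "voiced", "onset")
    if (voiced_like and first_class == "unvoiced") or (last_was_cng and first_is_active):
        return g1
    return g0
```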
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (redundancy information) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5963897A
CLAIM 44
. A method of creating an excitation signal associated with a segment of input speech , the method comprising : a . forming a spectral signal representative of the spectral parameters of the segment of input speech ;
b . filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech ;
c . producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech ;
d . creating a set of excitation candidate signals , the set having at least one member , each excitation candidate signal comprised of a sequence of single waveforms , each waveform having a type , the sequence having at least one waveform , wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform ;
e . combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signal (speech signal, decoder determines concealment) s , the set having at least one member , each synthetic speech signal representative of the segment of input speech ;
f . spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals , the set having at least one member ;
g . determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals ;
h . selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding ;
and i . if no excitation signal is selected , recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals , and repeating steps (e)-(i) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (redundancy information) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5963897A
CLAIM 110
. A method of creating an excitation signal associated with a segment of input speech according to claim 90 , wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information (frame erasure concealment, conducting frame erasure concealment) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH11184498A

Filed: 1997-12-24     Issued: 1999-07-09

音声符号化/復号化方法

(Original Assignee) Toshiba Corp; 株式会社東芝     

Kimio Miseki, Katsumi Tsuchiya, 公生 三関, 勝美 土谷
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。
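In rough translation, claim 1 of JPH11184498A encodes speech parameters representing the spectral envelope via LSF (line spectral frequency) parameters: compute the autocorrelation of the input speech, derive first LSF parameters F(k), apply the companding f(k) = log_C(1 + A × F(k)) with positive constants A and C, quantize to f_q(k) and a first code, then invert with F_q(k) = (C^f_q(k) − 1) / A; claim 5 dequantizes that code at the decoder and applies the same inverse transform. A minimal sketch of the transform pair follows; the constant values are illustrative only:

```python
import numpy as np

A, C = 10.0, 2.0   # "A, C are positive constants" in the claim; these values are only illustrative

def lsf_forward(F):
    """f(k) = log_C(1 + A * F(k)): the companding applied before quantization (claim 1, step c)."""
    return np.log1p(A * np.asarray(F, dtype=float)) / np.log(C)

def lsf_inverse(fq):
    """F_q(k) = (C ** f_q(k) - 1) / A: the inverse transform of claim 1 step e / claim 5 step b."""
    return (np.power(C, np.asarray(fq, dtype=float)) - 1.0) / A

F = np.array([0.05, 0.12, 0.21])              # hypothetical LSF-domain values
assert np.allclose(lsf_inverse(lsf_forward(F)), F)
```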

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有すること (maximum amplitude) を特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有すること (maximum amplitude) を特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有すること (maximum amplitude) を特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH11184498A
CLAIM 1
【請求項1】LSF(線スペクトル周波数)パラメータ を介して入力音声信号 (sound signal, speech signal) のスペクトル包絡を表す音声パラ メータを符号化する過程を含む音声符号化方法におい て、 (a) 前記入力音声信号について自己相関係数を求めるス テップと、 (b) 前記自己相関係数を基にF(k)(k=1,2, …,N)で表される第1のLSFパラメータを得るステ ップと、 (c) 前記第1のLSFパラメータに対し、 f(k)=log C (1+A×F(k)) (A,Cは正の定数、k=1,2,…,N)なる変換を 行って、f(k)で表される第2のLSFパラメータを 得るステップと、 (d) 前記第2のLSFパラメータを量子化し、fq (k)で表される量子化された第3のLSFパラメータ および該第3のLSFパラメータを表す第1の符号を得 るステップと、 (e) 前記第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される第4のLSFパラメータを得るステッ プとを有することを特徴とする音声符号化方法。

JPH11184498A
CLAIM 5
【請求項5】請求項1〜3のいずれか1項に記載の音声 パラメータの符号化方法により得られた前記第1の符号 から該音声パラメータを復号化する過程を含む音声復号 (sound signal, speech signal) 化方法であって、 (a) 前記第1の符号に基づいて逆量子化を行い、fq (k)で表される前記第3のLSFパラメータを復号す るステップと、 (b) 復号された第3のLSFパラメータに対し、 Fq(k)=(C fq(k) −1)/A (k=1,2,…,N)なる逆変換を行って、Fq (k)で表される前記第4のLSFパラメータを得るス テップとを有することを特徴とする音声復号化方法。

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech signal, speech decoding) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).
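
The recovery behaviour recited in claim 17 above scales the synthesized signal so that its energy at the beginning of the first non erased frame matches the energy at the end of the last erased frame, then converges toward the received energy information parameter by the end of that frame while limiting any increase. A minimal sketch, assuming sample-by-sample linear gain interpolation, quarter-frame energy windows and an arbitrary illustrative gain cap:

    import numpy as np

    def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
        """Illustrative energy control for the first non erased frame after an erasure."""
        synth = np.asarray(synth, dtype=float)
        n = len(synth)
        e_begin = np.mean(synth[: n // 4] ** 2) + 1e-12   # energy at the frame beginning
        e_end = np.mean(synth[-(n // 4):] ** 2) + 1e-12   # energy at the frame end
        g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match last concealed frame
        g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to received energy
        gains = np.linspace(g0, g1, n)                    # sample-by-sample interpolation
        return gains * synth

    frame = 0.5 * np.sin(np.linspace(0.0, 20.0 * np.pi, 256))
    scaled = scale_first_good_frame(frame, e_end_concealed=0.05, e_target=0.20)
    print(float(np.mean(scaled[:64] ** 2)), float(np.mean(scaled[-64:] ** 2)))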

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (speech signal, speech decoding) is a speech signal (speech signal, speech decoding) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (speech signal, speech decoding) is a speech signal (speech signal, speech decoding) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech signal, speech decoding) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech signal, speech decoding) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech signal, speech decoding) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (having) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having (maximum amplitude) the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).
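
For contrast with the claimed phase information parameter, what the cited JPH11184498A claims actually disclose is a log-domain warping of LSF parameters and its exact inverse. A minimal sketch of that transform pair, with A and C chosen arbitrarily (the claims only require them to be positive constants):

    import numpy as np

    A, C = 10.0, 2.0   # illustrative positive constants; the claims leave them open

    def lsf_forward(F):
        """f(k) = log_C(1 + A * F(k)) -- the 'second LSF parameters' of claim 1."""
        return np.log1p(A * np.asarray(F, dtype=float)) / np.log(C)

    def lsf_inverse(fq):
        """Fq(k) = (C ** fq(k) - 1) / A -- the inverse transform of claims 1 and 5."""
        return (np.power(C, np.asarray(fq, dtype=float)) - 1.0) / A

    F = np.array([0.05, 0.12, 0.31, 0.44])
    print(lsf_inverse(lsf_forward(F)))   # recovers F exactly (before any quantization)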

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (speech signal, speech decoding) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal, speech decoding) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (speech signal, speech decoding) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH11184498A
CLAIM 1
[Claim 1] A speech encoding method including a process of encoding, via LSF (line spectral frequency) parameters, speech parameters representing a spectral envelope of an input speech signal (sound signal, speech signal), the method being characterized by having the steps of: (a) obtaining autocorrelation coefficients of the input speech signal; (b) obtaining, based on the autocorrelation coefficients, first LSF parameters represented by F(k) (k = 1, 2, …, N); (c) applying, to the first LSF parameters, the transform f(k) = log_C(1 + A × F(k)) (where A and C are positive constants, k = 1, 2, …, N) to obtain second LSF parameters represented by f(k); (d) quantizing the second LSF parameters to obtain quantized third LSF parameters represented by fq(k) and a first code representing the third LSF parameters; and (e) applying, to the third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain fourth LSF parameters represented by Fq(k).

JPH11184498A
CLAIM 5
[Claim 5] A speech decoding (sound signal, speech signal) method including a process of decoding the speech parameters from the first code obtained by the speech parameter encoding method according to any one of claims 1 to 3, the method being characterized by having the steps of: (a) performing inverse quantization based on the first code to decode the third LSF parameters represented by fq(k); and (b) applying, to the decoded third LSF parameters, the inverse transform Fq(k) = (C^fq(k) − 1)/A (k = 1, 2, …, N) to obtain the fourth LSF parameters represented by Fq(k).
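
Taking the relation in claim 25 as transcribed above (E_q = E_1 · E_LP0 / E_LP1), the adjustment scales the concealed excitation energy by the ratio of the energies of the LP synthesis filter impulse responses before and after the erasure. A minimal sketch, assuming direct-form LP coefficients, a 64-sample impulse-response truncation and scipy's lfilter (all illustrative choices):

    import numpy as np
    from scipy.signal import lfilter

    def lp_impulse_response_energy(a, length=64):
        """Energy of a truncated impulse response of the LP synthesis filter 1/A(z)."""
        impulse = np.zeros(length)
        impulse[0] = 1.0
        h = lfilter([1.0], a, impulse)
        return float(np.sum(h ** 2))

    def adjusted_energy(e1, a_last_good, a_first_good):
        """E_q = E_1 * E_LP0 / E_LP1, as transcribed in claim 25 (illustrative)."""
        e_lp0 = lp_impulse_response_energy(a_last_good)    # last non erased frame before erasure
        e_lp1 = lp_impulse_response_energy(a_first_good)   # first non erased frame after erasure
        return e1 * e_lp0 / e_lp1

    # Toy second-order LP synthesis filters A(z) = 1 + a1*z^-1 + a2*z^-2.
    print(adjusted_energy(1.0, a_last_good=[1.0, -1.2, 0.5], a_first_good=[1.0, -1.5, 0.7]))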




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6009388A

Filed: 1997-12-16     Issued: 1999-12-28

High quality speech code and coding method

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (time length) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (inverse filtering, impulse response, represents a, speech coder, response signal) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (inverse filtering, impulse response, represents a, speech coder, response signal) of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length (pitch period) ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit (average pitch value) operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .
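
Claim 1 of US7693710B2, charted above against US6009388A, constructs the missing periodic excitation as a low-pass filtered train of pulses: the first impulse response of the low-pass filter is centred on the quantized first-glottal-pulse position and the remaining responses are spaced by the average pitch value. A minimal sketch, assuming a short symmetric windowed-sinc low-pass response and ignoring sub-frame bookkeeping (both assumptions, not claim requirements):

    import numpy as np

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
        """Artificial periodic excitation for a lost onset frame (sketch).
        lp_ir is the impulse response of a low-pass filter (odd length, symmetric)."""
        excitation = np.zeros(frame_len)
        half = len(lp_ir) // 2
        pos = first_pulse_pos
        while pos < frame_len:                       # place pulses up to the end of the frame
            lo, hi = pos - half, pos + half + 1      # centre the impulse response on the pulse
            src_lo = max(0, -lo)
            src_hi = len(lp_ir) - max(0, hi - frame_len)
            excitation[max(0, lo):min(frame_len, hi)] += lp_ir[src_lo:src_hi]
            pos += avg_pitch                         # next pulse one average pitch period later
        return excitation

    # Illustrative 11-tap windowed-sinc low-pass response; pitch of 60 samples.
    taps = np.arange(-5, 6)
    low_pass = np.sinc(taps / 2.0) * np.hamming(11)
    exc = build_periodic_excitation(256, first_pulse_pos=17, avg_pitch=60, lp_ir=low_pass)
    print(exc.shape, int(np.argmax(exc)))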

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .
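
Claim 2 above further encodes a shape, sign and amplitude for the first glottal pulse. The following sketch assumes a hypothetical three-entry shape codebook, a one-bit sign and a 4-bit logarithmic amplitude index; all of these are illustrative choices and not taken from the patent:

    import numpy as np

    # Hypothetical 3-sample shape codebook (illustrative only).
    SHAPES = [np.array([0.0, 1.0, 0.0]),    # single impulse
              np.array([0.5, 1.0, 0.5]),    # triangular pulse
              np.array([-0.3, 1.0, -0.3])]  # pulse with negative side lobes

    def encode_first_pulse(residual, position):
        """Encode shape index, sign bit and 4-bit log amplitude of the first glottal pulse.
        Boundary handling is omitted: 'position' is assumed to lie at least one sample
        away from either edge of the frame."""
        seg = np.asarray(residual[position - 1: position + 2], dtype=float)
        amp = float(np.max(np.abs(seg)))
        sign = 0 if residual[position] >= 0 else 1
        # Pick the codebook shape with the highest normalized correlation to the segment.
        scores = [abs(np.dot(seg, s)) / (np.linalg.norm(s) * np.linalg.norm(seg) + 1e-12)
                  for s in SHAPES]
        shape_idx = int(np.argmax(scores))
        amp_idx = int(np.clip(np.round(4 * np.log2(amp + 1e-12)) + 8, 0, 15))  # 4-bit log scale
        return shape_idx, sign, amp_idx

    res = np.zeros(160); res[39:42] = [0.4, 0.8, 0.4]
    print(encode_first_pulse(res, 40))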

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6009388A
CLAIM 1
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length (pitch period) ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number, second line) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (comfort noise) (FLPCs) from a predetermined number (last non) of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (inverse filtering, impulse response, represents a, speech coder, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (inverse filtering, impulse response, represents a, speech coder, response signal) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (inverse filtering, impulse response, represents a, speech coder, response signal) of the LP filter of a last non (predetermined number, second line) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number (last non) of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear (last non) prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook unit adapted to determine a pitch prediction signal (E q) as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
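Note (illustrative only): the position-finding step in claim 11 amounts to picking the maximum-amplitude sample inside one pitch period and quantizing its index. A minimal sketch, assuming the search runs on an LP residual buffer and a uniform 4-sample quantization step (both assumptions, not from the patent):

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, quant_step=4):
    """Take the sample of maximum absolute amplitude inside the first pitch
    period as a stand-in for the first glottal pulse, then quantize its
    position with a uniform step (illustrative choice)."""
    search = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(search)))        # candidate glottal pulse position
    sign = 1 if search[pos] >= 0 else -1        # pulse sign
    amplitude = float(abs(search[pos]))         # pulse amplitude
    quantized_pos = (pos // quant_step) * quant_step
    return quantized_pos, sign, amplitude
```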
US6009388A
CLAIM 1
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length (pitch period) ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (inverse filtering, impulse response, represents a, speech coder, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (inverse filtering, impulse response, represents a, speech coder, response signal) of the LP filter of a last non erased frame (predetermined number, second line) received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number (last non) of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear (last non) prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal (E q) as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (time length) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (inverse filtering, impulse response, represents a, speech coder, response signal) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (inverse filtering, impulse response, represents a, speech coder, response signal) of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
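Note (illustrative only): the artificial onset reconstruction in claim 13 can be pictured as summing shifted copies of a low-pass filter's impulse response, the first copy centred on the transmitted glottal-pulse position and the rest spaced by the average pitch. The sketch below is a loose illustration under assumed inputs (a precomputed symmetric FIR response, an integer average pitch greater than zero); it is not the codec's actual routine.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_response):
    """Lay down a low-pass filtered pulse train: centre the first impulse
    response on the quantized first-glottal-pulse position, then repeat it
    every avg_pitch samples up to the end of the excitation buffer."""
    lp_response = np.asarray(lp_response, dtype=float)
    excitation = np.zeros(frame_len)
    half = len(lp_response) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start = max(pos - half, 0)
        stop = min(pos - half + len(lp_response), frame_len)
        excitation[start:stop] += lp_response[start - (pos - half):stop - (pos - half)]
        pos += avg_pitch                      # assumed integer pitch spacing > 0
    return excitation
```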
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length (pitch period) ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit (average pitch value) operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6009388A
CLAIM 1
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length (pitch period) ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
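Note (illustrative only): claim 16 ties the energy information parameter to the frame class, a peak-based measure for voiced/onset frames and an average per-sample energy otherwise. A minimal sketch of that distinction, with the class labels and the dB conversion treated as assumptions:

```python
import numpy as np

def energy_information(frame, frame_class):
    """Return an energy parameter: maximum-based for 'voiced'/'onset' frames,
    average energy per sample for all other classes (illustrative only)."""
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energy = float(np.max(frame ** 2))      # maximum of the signal energy
    else:
        energy = float(np.mean(frame ** 2))     # average energy per sample
    return 10.0 * np.log10(max(energy, 1e-12))  # dB expression is an assumption
```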
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
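Note (illustrative only): the recovery behaviour in claim 17 can be read as a gain trajectory that starts at the concealed-frame energy and converges to the energy implied by the received parameter, with the growth capped. The linear ramp and the cap value in the sketch below are assumptions, not taken from the patent:

```python
import numpy as np

def scale_recovered_frame(synth, e_end_concealed, e_received, max_gain=2.0):
    """Scale a synthesized recovery frame so its energy starts near the
    concealed frame's end energy and converges to the received target,
    while limiting how much the gain may grow."""
    synth = np.asarray(synth, dtype=float)
    e_cur = float(np.mean(np.square(synth))) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_cur), max_gain)  # gain at frame start
    g1 = min(np.sqrt(e_received / e_cur), max_gain)       # gain at frame end
    gains = np.linspace(g0, g1, len(synth))                # sample-by-sample ramp
    return synth * gains
```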
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame (predetermined number, second line) received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
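Note (illustrative only): claim 19 identifies two transition cases in which the start-of-frame scaling gain is simply set equal to the end-of-frame gain. A hypothetical condition check for those two cases, using assumed class labels:

```python
def use_flat_gain(last_good_class, first_good_class,
                  last_good_is_comfort_noise, first_good_is_active_speech):
    """Return True when the recovery gain should stay flat across the frame:
    voiced-to-unvoiced transitions and comfort-noise-to-active-speech onsets."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class == "unvoiced")
    dtx_to_speech = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or dtx_to_speech
```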
US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (comfort noise) (FLPCs) from a predetermined number (last non) of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (inverse filtering, impulse response, represents a, speech coder, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (inverse filtering, impulse response, represents a, speech coder, response signal) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (pitch p) = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (inverse filtering, impulse response, represents a, speech coder, response signal) of a LP filter of a last non erased frame (predetermined number, second line) received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number (last non) of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear (last non) prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook unit adapted to determine a pitch (E q) prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
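
A minimal sketch of how a shape, sign and amplitude of a first glottal pulse could be encoded is given below, assuming a small table of unit-norm candidate pulse shapes and a crude uniform amplitude index; the table, the bit width and the function names are illustrative assumptions, not the patent's codec.

    # Illustrative sketch (assumed table and bit widths): encode the shape, sign and a crude
    # amplitude index of the first glottal pulse found at position `pos` in the LP residual.
    import numpy as np

    def encode_first_glottal_pulse(residual, pos, shapes, amp_bits=4):
        """shapes: (n_shapes, L) array of unit-norm candidate pulse shapes (assumption)."""
        pulse_amp = float(residual[pos])
        sign = 0 if pulse_amp >= 0 else 1
        length = shapes.shape[1]
        seg = np.asarray(residual[pos:pos + length], dtype=float)
        if len(seg) < length:                      # pad if the pulse sits near the frame end
            seg = np.pad(seg, (0, length - len(seg)))
        seg = seg / (np.linalg.norm(seg) + 1e-12)
        shape_idx = int(np.argmax(shapes @ seg))   # best shape by normalized correlation
        amp_idx = min(int(abs(pulse_amp)), (1 << amp_bits) - 1)  # placeholder uniform index
        return shape_idx, sign, amp_idx
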
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
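
A minimal sketch of the position search and quantization recited above is given below, assuming the pulse is taken as the maximum-amplitude sample of the LP residual within the first pitch period and that its position is uniformly quantized; the residual input, the bit budget and the function names are assumptions.

    # Illustrative sketch (assumed names and bit budget): locate the first glottal pulse as the
    # maximum-amplitude residual sample within the first pitch period, then quantize its position.
    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period):
        window = np.abs(np.asarray(residual[:pitch_period], dtype=float))
        return int(np.argmax(window))

    def quantize_pulse_position(position, pitch_period, bits=6):
        levels = 1 << bits
        step = pitch_period / levels
        index = min(int(position / step), levels - 1)
        return index, index * step                 # codeword and reconstructed position
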
US6009388A
CLAIM 1
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length (pitch period) ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
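
A minimal sketch of the energy-information computation recited above is given below, assuming pitch-synchronous windows for the maximum-energy case; the class labels and the window handling are assumptions.

    # Illustrative sketch (assumed class labels and windowing): energy information parameter
    # computed as the maximum pitch-synchronous energy for voiced/onset frames and as the
    # average energy per sample otherwise.
    import numpy as np

    def energy_information(frame, frame_class, pitch_period=None):
        x = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset") and pitch_period and pitch_period <= len(x):
            energies = [np.sum(x[i:i + pitch_period] ** 2)
                        for i in range(len(x) - pitch_period + 1)]
            return float(max(energies))
        return float(np.mean(x ** 2))              # average energy per sample
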
US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response from a z-transform circuit ;
a response signal generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (inverse filtering, impulse response, represents a, speech coder, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (pitch p) = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response (inverse filtering, impulse response, represents a, speech coder, response signal) of a LP filter of a last non (predetermined number, second line) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009388A
CLAIM 1
. A speech coder (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue signal from the input speech signal by using the first coefficient signal ;
a second coefficient analyzing unit operable to derive second coefficients representing a spectral characteristic of the predicted residue signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation signal generating unit operable to derive an excitation signal in accordance with the input speech signal in a particular frame , the first coefficient signal , the second coefficient signal and the quantized coefficient signal , the excitation signal generating unit including a quantizer operable to quantize the excitation signal and provide the quantized signal as a quantized excitation signal ;
and a speech reproducing unit operable to reproduce speech of the particular frame by using the first coefficient signal , the quantized coefficient signal and the quantized excitation signal to produce a speech reproduction signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 3
. A speech coder comprising : a divider operable to divide an input speech signal into a plurality of frames having a predetermined time length ;
a first coefficient analyzing unit operable to derive first coefficients representing a spectral characteristic of a past speech reproduction signal and provide the first coefficients as a first coefficient signal ;
a residue generating unit operable to derive a predicted residue from the input speech signal by using the first coefficients and provide a predicted gain signal representing a predicted gain calculated from the predicted residue ;
a judging unit operable to determine whether the predicted gain represented by the predicted gain signal is above a predetermined threshold and provide a judge signal representing the result of the determination ;
a second coefficient analyzing unit operative , when the judge signal represents a (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) predetermined value , to derive second coefficients representing a spectral characteristic of the predicted gain from the predicted gain signal and provide the second coefficients as a second coefficient signal ;
a coefficient quantizing unit operable to quantize the second coefficients represented by the second coefficient signal and provide the quantized second coefficients as a quantized coefficient signal ;
an excitation generating unit operable to produce a quantized excitation signal in accordance with the input speech signal by quantizing the speech signal , the second coefficient signal and the quantized coefficient signal , the excitation generating unit using the second coefficients to produce the quantized excitation signal depending on the value of the judge signal ;
and a speech reproducing unit operable to produce a speech reproduction signal of a pertinent frame by using the second coefficients , the quantized coefficient signal and the quantized excitation signal , the speech reproducing unit using the first coefficients to produce the speech reproduction signal depending on the value of the judge signal ;
the past speech reproduction signal being derived from the speech reproduction signal .

US6009388A
CLAIM 10
. A coder for producing an output speech signal from an input speech signal , comprising : a frame divider adapted to divide the input speech signal into time frames of a predetermined length ;
a first signal generator having a linear prediction analyzer to produce first linear prediction coefficients (FLPCs) from a predetermined number (last non) of samples of an output speech feedback signal , the FLPCs being of a predetermined degree ;
a residue signal generator adapted to produce a predictive residue signal as a function of inverse filtering (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) a predetermined number of samples of the input speech signal and the FLPCs ;
a second signal generator having a linear prediction analyzer to produce second linear (last non) prediction coefficients (SLPCs) from a predetermined number of samples of the predictive residue signal , the SLPCs being of a predetermined degree , the second signal generator having a linear spectrum pair (LSP) analyzer to produce LSP parameters from the SLPCs ;
a quantizer adapted to produce a quantized signal obtained by quantizing the LSP parameters ;
an excitation unit having an excitation quantizer , the excitation unit being adapted to produce a quantized excitation signal based on the input speech signal , the FLPCs , the SLPCs , and the quantized signal ;
and a speech reproducing unit adapted to produce a speech reproduction signal for each frame and the output speech feedback signal using the FLPCs , the quantized signal and the quantized excitation signal .

US6009388A
CLAIM 15
. The coder of claim 10 , wherein the excitation unit comprises : an acoustical weighing circuit having a linear prediction analyzer adapted to produce third linear prediction coefficients (TLPCs) from the input speech signal , the acoustical weighing circuit also having a filter adapted to receive the TLPCs to produce a weighted speech signal ;
an impulse generator adapted to produce an impulse response (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) from a z-transform circuit ;
a response signal (LP filter, LP filter excitation signal, decoder determines concealment, impulse responses, impulse response) generator adapted to produce a response signal representing the input speech signal at zero value from the FLPCs , SLPCs , quantized signal and stored memory values ;
a subtractor adapted to produce a subtraction signal by subtracting the response signal from the weighted speech signal ;
an adaptive codebook (sound signal, speech signal) unit adapted to determine a pitch (E q) prediction signal as a function of a delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal , the adaptive codebook unit also being adapted to produce a pitch prediction residue signal as a function of a gain value , the delay T , the subtraction signal , the impulse response from the impulse generator , and a past sample of the excitation signal ;
and an excitation quantizer adapted to produce a quantized excitation signal from the impulse response from the impulse generator , pitch prediction signal , and the pitch prediction signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5870412A

Filed: 1997-12-12     Issued: 1999-02-09

Forward error correction system for packet based real time media

(Original Assignee) 3Com Corp     (Current Assignee) HP Inc ; Hewlett Packard Enterprise Development LP

Guido M. Schuster, Jerry Mahler, Ikhlaq Sidhu, Michael Borella
US7693710B2
CLAIM 1
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
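
A minimal sketch of the artificial periodic excitation recited in the charted claim is given below, assuming a pre-computed low-pass filter impulse response and a strictly positive average pitch value; the names are illustrative only.

    # Illustrative sketch (assumed filter taps and positive average pitch): build the periodic
    # excitation for a lost onset frame as a low-pass filtered pulse train, with the first
    # impulse response centred on the quantized first-glottal-pulse position.
    import numpy as np

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_impulse_response):
        exc = np.zeros(frame_len)
        half = len(lp_impulse_response) // 2
        pos = first_pulse_pos
        while pos < frame_len:                     # place one response per pitch cycle
            for i, h in enumerate(lp_impulse_response):
                j = pos - half + i                 # centre the response on the pulse position
                if 0 <= j < frame_len:
                    exc[j] += h
            pos += avg_pitch                       # assumes avg_pitch > 0
        return exc
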
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .
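
For comparison, the XOR-based forward error correction of US5870412A claim 1 can be sketched as follows, assuming equal-length payload packets and a history already holding at least w previous payloads; the function names are assumptions.

    # Illustrative sketch (assumed equal-length packets and warm-started history): append
    # FEC[k] = PL[k-1] XOR ... XOR PL[k-w] to the current payload, as in claim 1 above.
    def xor_blocks(blocks):
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    def append_fec(pl_k, previous_payloads, w):
        """Return (PL[k], FEC[k]); previous_payloads must hold at least w equal-length payloads."""
        fec_k = xor_blocks(previous_payloads[-w:])
        return pl_k, fec_k
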

US7693710B2
CLAIM 2
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
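
A minimal sketch of the energy-control behaviour recited above is given below, assuming quarter-frame windows for the start/end energy measurements, a linear per-sample gain ramp, and an arbitrary cap on the upward gain; all of these are assumptions, not the patent's procedure.

    # Illustrative sketch (assumed quarter-frame energy windows, linear gain ramp and gain cap):
    # scale the first good frame so its start matches the energy at the end of the concealed
    # frame, then converge toward the received target energy while limiting any increase.
    import numpy as np

    def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
        synth = np.asarray(synth, dtype=float)
        quarter = max(len(synth) // 4, 1)
        e_begin = np.sum(synth[:quarter] ** 2) + 1e-12
        e_end = np.sum(synth[-quarter:] ** 2) + 1e-12
        g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # continuity with concealed frame
        g1 = min(np.sqrt(e_target / e_end), max_gain)           # convergence to received energy
        gains = np.linspace(g0, g1, len(synth))                 # sample-by-sample interpolation
        return synth * gains
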
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .
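
A minimal sketch of the single-loss recovery rule of US5870412A claim 12 is given below, assuming equal-length blocks and exactly one missing payload in the analysis window; the bookkeeping and names are assumptions.

    # Illustrative sketch (assumed equal-length blocks, exactly one loss in the window): rebuild
    # the missing payload by XOR-ing the covering FEC block with the received payloads it covers.
    def xor_bytes(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def recover_missing(payloads_in_window, covering_fec):
        """payloads_in_window: list with exactly one None entry for the lost block."""
        missing_index = payloads_in_window.index(None)
        recovered = covering_fec
        for i, payload in enumerate(payloads_in_window):
            if i != missing_index:
                recovered = xor_bytes(recovered, payload)
        return missing_index, recovered
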

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL[k] , a forward error correction code FEC[k]=PL[k-1] XOR PL[k-2] XOR . . . , XOR PL[k-w] ;
a second segment for appending said forward error correction code FEC[k] to said payload packet PL[k] , said payload packet PL[k] and forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P[i] is lost in said transmission (last non, first non) , payload packet PL[i] may be recreated by extracting PL[i] from one or more other packets .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (predetermined number, said transmission, sequence number) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction code FEC[k] comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL[k] , a forward error correction code FEC[k]=PL[k-1] XOR PL[k-2] XOR . . . , XOR PL[k-w] ;
a second segment for appending said forward error correction code FEC[k] to said payload packet PL[k] , said payload packet PL[k] and forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P[i] is lost in said transmission (last non, first non) , payload packet PL[i] may be recreated by extracting PL[i] from one or more other packets .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (predetermined number, said transmission, sequence number) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number, said transmission, sequence number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
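
A minimal sketch of the gain-selection rule recited above is given below; the class labels and the fall-through behaviour are assumptions made only for illustration.

    # Illustrative sketch (assumed class labels): select the gain applied at the start of the
    # first good frame; in the two transitions named in the claim it equals the end gain.
    def starting_gain(last_good_class, first_good_class,
                      last_good_was_comfort_noise, first_good_is_active_speech, g_end):
        voiced_like = {"voiced transition", "voiced", "onset"}
        if last_good_class in voiced_like and first_good_class == "unvoiced":
            return g_end                           # voiced -> unvoiced transition
        if last_good_was_comfort_noise and first_good_is_active_speech:
            return g_end                           # comfort noise -> active speech transition
        return None                                # otherwise use the normal energy scaling
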
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction code FEC[k] comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL[k] , a forward error correction code FEC[k]=PL[k-1] XOR PL[k-2] XOR . . . , XOR PL[k-w] ;
a second segment for appending said forward error correction code FEC[k] to said payload packet PL[k] , said payload packet PL[k] and forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P[i] is lost in said transmission (last non, first non) , payload packet PL[i] may be recreated by extracting PL[i] from one or more other packets .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 6
. A method as claimed in claim 1 , wherein said payload represents a (LP filter excitation signal) real-time media signal .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL[k] , a forward error correction code FEC[k]=PL[k-1] XOR PL[k-2] XOR . . . , XOR PL[k-w] ;
a second segment for appending said forward error correction code FEC[k] to said payload packet PL[k] , said payload packet PL[k] and forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P[i] is lost in said transmission (last non, first non) , payload packet PL[i] may be recreated by extracting PL[i] from one or more other packets .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non (predetermined number, said transmission, sequence number) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number, said transmission, sequence number) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction code FEC[k] comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 6
. A method as claimed in claim 1 , wherein said payload represents a (LP filter excitation signal) real-time media signal .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL[k] , a forward error correction code FEC[k]=PL[k-1] XOR PL[k-2] XOR . . . , XOR PL[k-w] ;
a second segment for appending said forward error correction code FEC[k] to said payload packet PL[k] , said payload packet PL[k] and forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P[i] is lost in said transmission (last non, first non) , payload packet PL[i] may be recreated by extracting PL[i] from one or more other packets .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL[k-w] , . . . , PL[k-1] , PL[k-2] , PL[k] , PL[k+1] , . . . , PL[k+u] , said method comprising , in combination : to each payload packet PL[k] , appending a forward error correction (concealing frame erasure) code FEC[k] comprising the XOR sum of a predetermined number of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL[k] and said forward error correction code FEC[k] defining , in combination , a packet P[k] ;
and transmitting a sequence of said packets P[k] , P[k+1] , . . . , P[k+u] from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (forward error correction) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction (concealing frame erasure) code FEC k! comprising the XOR sum of a predetermined number of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

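Claims 10 and 11 of US7693710B2, charted above, locate the first glottal pulse as the sample of maximum amplitude within a pitch period and quantize its position (claim 10 additionally encodes the pulse shape, sign and amplitude). A minimal sketch, assuming a decoded excitation array, an integer pitch value and a uniform position quantizer; the quantizer and the q_bits parameter are not taken from the patent.

import numpy as np

def first_glottal_pulse(excitation, pitch, q_bits=6):
    """Illustrative sketch of US7693710B2 claims 10-11: take the sample of
    maximum amplitude within the first pitch period as the first glottal
    pulse, note its sign and amplitude, and quantize its position.  The
    uniform quantizer and q_bits are assumptions, not claim language."""
    period = excitation[:pitch]                    # first pitch period of the frame
    pos = int(np.argmax(np.abs(period)))           # sample of maximum amplitude
    sign = int(np.sign(period[pos]))               # claim 10: sign of the pulse
    amplitude = float(np.abs(period[pos]))         # claim 10: amplitude of the pulse
    levels = 2 ** q_bits                           # claim 11: quantize the position
    q_index = min(levels - 1, pos * levels // pitch)
    return q_index, sign, amplitude

rng = np.random.default_rng(0)
exc = rng.standard_normal(256)
exc[37] = 10.0                                     # synthetic "glottal pulse"
print(first_glottal_pulse(exc, pitch=80))          # -> (29, 1, 10.0)
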
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number, said transmission, sequence number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 6
. A method as claimed in claim 1 , wherein said payload represents a (LP filter excitation signal) real-time media signal .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .

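The relation recited in claims 12, 21 and 25, reconstructed above as E_q = E_1 · (E_LP0 / E_LP1), can be exercised with a short numeric sketch. The truncated impulse-response length, the toy LP coefficients and the rescaling of the whole excitation frame are assumptions made only to show how the three energies interact.

import numpy as np

def lp_impulse_energy(a, n=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z),
    A(z) = 1 + a[1]z^-1 + ..., truncated to n samples (length assumed)."""
    h = np.zeros(n)
    for i in range(n):
        x = 1.0 if i == 0 else 0.0
        h[i] = x - sum(a[j] * h[i - j] for j in range(1, len(a)) if i - j >= 0)
    return float(np.sum(h ** 2))

def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
    """Sketch of E_q = E_1 * (E_LP0 / E_LP1): rescale the excitation of the
    first non-erased frame when the new LP filter has the higher gain."""
    e_lp0 = lp_impulse_energy(a_last_good)         # last non-erased frame before erasure
    e_lp1 = lp_impulse_energy(a_first_good)        # first non-erased frame after erasure
    if e_lp1 <= e_lp0:                             # claim applies only when the new gain is higher
        return excitation
    e_q = e1 * e_lp0 / e_lp1
    e_exc = float(np.sum(excitation ** 2))
    return excitation * np.sqrt(e_q / max(e_exc, 1e-12))

a_last = np.array([1.0, -0.5])                     # toy LP coefficients (assumed)
a_first = np.array([1.0, -0.9])
exc = 0.1 * np.ones(160)
scaled = adjust_excitation_energy(exc, e1=1.0, a_last_good=a_last, a_first_good=a_first)
print(round(float(np.sum(scaled ** 2)), 3))        # excitation energy pulled down toward E_q
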
US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .

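Claims 17 and 18, charted above, describe two coupled energy controls in the decoder: scale the start of the first non-erased frame to match the end of the last concealed frame, converge toward the transmitted energy information by the frame end while limiting any increase, and cap the scaling gain when that frame is classified as onset. A rough sketch follows; the analysis window length, the linear gain ramp and the numeric caps are assumptions rather than values from the patent.

import numpy as np

def rescale_first_good_frame(synth, e_prev_end, e_target, is_onset=False,
                             max_gain=2.0, onset_cap=1.2):
    """Illustrative sketch of US7693710B2 claims 17-18: match the initial
    energy of the first non-erased frame to the end of the last concealed
    frame (g0), converge to the transmitted energy information by the frame
    end (g1), limit energy increases, and cap the gain for onset frames.
    Window length, ramp shape and cap values are assumptions."""
    e_begin = float(np.mean(synth[:32] ** 2)) + 1e-12   # energy at frame start
    e_end = float(np.mean(synth[-32:] ** 2)) + 1e-12    # energy at frame end
    g0 = min(np.sqrt(e_prev_end / e_begin), max_gain)   # limit the increase in energy
    g1 = min(np.sqrt(e_target / e_end), max_gain)
    if is_onset:
        g0 = min(g0, onset_cap)                         # claim 18: cap the gain on onsets
    gains = np.linspace(g0, g1, len(synth))             # sample-by-sample interpolation
    return synth * gains

frame = np.sin(2 * np.pi * 200 * np.arange(160) / 8000)
out = rescale_first_good_frame(frame, e_prev_end=0.1, e_target=0.5)
print(round(float(np.mean(out[:32] ** 2)), 2), round(float(np.mean(out[-32:] ** 2)), 2))
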
US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (predetermined number, said transmission, sequence number) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number, said transmission, sequence number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .

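Claim 19, charted above, is a gain-selection rule rather than a computation: in two transition cases the gain used at the beginning of the first non-erased frame is simply set equal to the gain used at its end. A small sketch of that rule; the class labels and flag names are assumptions.

def start_gain(g_begin, g_end, last_class, first_class,
               last_is_comfort_noise, first_is_active):
    """Sketch of the US7693710B2 claim 19 rule: reuse the end-of-frame gain at
    the start of the first non-erased frame for a voiced-to-unvoiced transition
    and for a comfort-noise-to-active-speech transition; otherwise keep the
    separately computed start gain.  Labels and flags are assumptions."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    if last_class in voiced_like and first_class == "unvoiced":
        return g_end                      # voiced frame -> unvoiced frame
    if last_is_comfort_noise and first_is_active:
        return g_end                      # non-active -> active speech period
    return g_begin

print(start_gain(1.3, 0.8, "voiced", "unvoiced", False, True))   # -> 0.8
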
US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 6
. A method as claimed in claim 1 , wherein said payload represents a (LP filter excitation signal) real-time media signal .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non (predetermined number, said transmission, sequence number) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number, said transmission, sequence number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 6
. A method as claimed in claim 1 , wherein said payload represents a (LP filter excitation signal) real-time media signal .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (predetermined number, said transmission, sequence number) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number, said transmission, sequence number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5870412A
CLAIM 1
. A method for communicating payload in a digital transmission system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said method comprising , in combination : to each payload packet PL k! , appending a forward error correction code FEC k! comprising the XOR sum of a predetermined number (last non, first non) of preceding payload packets , said predetermined number being greater than 1 , said payload packet PL k! and said forward error correction code FEC k! defining , in combination , a packet P k! ;
and transmitting a sequence of said packets P k! , P k+1! , . . . , P k+u! from a first device in said digital transmission system for receipt by a second device in said digital transmission system .

US5870412A
CLAIM 6
. A method as claimed in claim 1 , wherein said payload represents a (LP filter excitation signal) real-time media signal .

US5870412A
CLAIM 12
. A method of recovering a lost packet in a sequence of packets transmitted in a telecommunications system , each packet in said sequence defining a sequence number (last non, first non) and carrying a payload block and a redundancy block , said redundancy block in a given packet being defined by an XOR sum of a predetermined number of preceding payload blocks in said sequence , said method comprising , in combination : (a) receiving an incoming packet of said sequence ;
(b) establishing a window of analysis beginning with said incoming packet and extending for said predetermined number of packets of said sequence following said incoming packet ;
and (c) if only one payload block in said window of analysis has not yet been received , recovering said one payload block by taking an XOR sum of a plurality of payload blocks within said window of analysis .

US5870412A
CLAIM 21
. An apparatus for transmitting payload in a digital communications system , said payload being divided into a sequence of payload packets , PL k-w! , . . . , PL k-1! , PL k-2! , PL k! , PL k+1! , . . . , PL k+u! , said apparatus comprising , in combination : a first segment for generating , for each payload packet PL k! , a forward error correction code FEC k!=PL k-1! XOR PL k-2! XOR . . . , XOR PL k-w! ;
a second segment for appending said forward error correction code FEC k! to said payload packet PL k! , said payload packet PL k! and forward error correction code FEC k! defining , in combination , a packet P k! ;
and a third segment for transmitting a sequence of said packets from a first location in said digital transmission system to a second location in said digital transmission system , whereby , when said sequence of packets is transmitted from said first location in said digital communications system to said second location in said digital communications system , if a packet P i! is lost in said transmission (last non, first non) , payload packet PL i! may be recreated by extracting PL i! from one or more other packets .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
WO9827543A2

Filed: 1997-12-05     Issued: 1998-06-25

Multi-feature speech/music discrimination system

(Original Assignee) Interval Research Corporation     

Eric D. Scheirer, Malcolm Slaney
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (successive frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch (modulation frequencies) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
WO9827543A2
CLAIM 1
. A method for discriminating between speech and music content in an audio signal , comprising the steps of : selecting a set of audio signal samples (impulse responses) ;
measuring values for a plurality of features in each sample of said set of samples ;
defining a multi-dimensional feature space containing data points which respectively correspond to the measured feature values for each sample , and labelling each data point as relating to speech or music ;
measuring feature values for a test sample of an audio signal and determining a corresponding data point in said feature space ;
determining the label for at least one data point in said feature space which is close to the data point corresponding to said test sample ;
and classifying the test sample in accordance with the determined label .

WO9827543A2
CLAIM 8
. The method of claim 1 wherein one of said features is the proportion of energy in the audio signal having speech modulation frequencies (average pitch, average pitch value) .

WO9827543A2
CLAIM 11
. The method of claim 1 wherein said audio signal is divided into a sequence of frames and further including the steps of classifying each frame of the test sample as relating to speech or music , examining the classifications for a plurality of successive frames (onset frame) , and determining a final classification on the basis of the examined classifications .

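Claims 1 and 13 of US7693710B2, charted above against WO9827543A2, construct the periodic excitation of a lost onset frame artificially: a first low-pass filter impulse response is centered on the quantized first-glottal-pulse position and the remaining responses are placed one average pitch value apart. A minimal sketch, assuming a toy FIR impulse response and building the train over the whole frame rather than only up to the last affected subframe.

import numpy as np

def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, lp_ir):
    """Sketch of the artificial periodic part of claims 1/13: a low-pass
    filtered train of pulses, the first impulse response centered on the
    quantized glottal-pulse position, the rest spaced by the average pitch.
    The FIR response and frame length are assumptions."""
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = q_pulse_pos
    while pos < frame_len:
        start, stop = pos - half, pos - half + len(lp_ir)
        lo, hi = max(start, 0), min(stop, frame_len)
        exc[lo:hi] += lp_ir[lo - start:hi - start]   # centered impulse response
        pos += int(round(avg_pitch))                 # next pulse one pitch period later
    return exc

lp_ir = np.hanning(11)                               # toy low-pass impulse response
exc = build_periodic_excitation(frame_len=256, q_pulse_pos=23, avg_pitch=80.0, lp_ir=lp_ir)
print(np.flatnonzero(exc == exc.max()))              # pulse centers at 23, 103, 183
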
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (dimensional feature space, feature values) per sample for other frames .
WO9827543A2
CLAIM 1
. A method for discriminating between speech and music content in an audio signal , comprising the steps of : selecting a set of audio signal samples ;
measuring values for a plurality of features in each sample of said set of samples ;
defining a multi-dimensional feature space (average energy) containing data points which respectively correspond to the measured feature values (average energy) for each sample , and labelling each data point as relating to speech or music ;
measuring feature values for a test sample of an audio signal and determining a corresponding data point in said feature space ;
determining the label for at least one data point in said feature space which is close to the data point corresponding to said test sample ;
and classifying the test sample in accordance with the determined label .

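Claim 4 (and claims 16 and 24 below) compute the energy information parameter two ways: in relation to a maximum of the signal energy for frames classified as voiced or onset, and in relation to an average energy per sample otherwise. A simplified sketch; the short fixed analysis window stands in for the pitch-synchronous maximum and is an assumption.

import numpy as np

def energy_information(frame, frame_class, win=32):
    """Sketch of the claim 4/16/24 energy information parameter: a maximum of
    the signal energy for voiced or onset frames, an average energy per sample
    for all other frames.  The fixed window length is an assumption."""
    e = frame.astype(float) ** 2
    if frame_class in ("voiced", "onset"):
        windows = e[: len(e) // win * win].reshape(-1, win)
        return float(windows.sum(axis=1).max())     # maximum of the signal energy
    return float(e.mean())                          # average energy per sample

x = np.concatenate([np.zeros(128), 0.5 * np.ones(32)])
print(energy_information(x, "voiced"), energy_information(x, "unvoiced"))   # -> 8.0 0.05
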
US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (successive frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch (modulation frequencies) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
WO9827543A2
CLAIM 1
. A method for discriminating between speech and music content in an audio signal , comprising the steps of : selecting a set of audio signal samples (impulse responses) ;
measuring values for a plurality of features in each sample of said set of samples ;
defining a multi-dimensional feature space containing data points which respectively correspond to the measured feature values for each sample , and labelling each data point as relating to speech or music ;
measuring feature values for a test sample of an audio signal and determining a corresponding data point in said feature space ;
determining the label for at least one data point in said feature space which is close to the data point corresponding to said test sample ;
and classifying the test sample in accordance with the determined label .

WO9827543A2
CLAIM 8
. The method of claim 1 wherein one of said features is the proportion of energy in the audio signal having speech modulation frequencies (average pitch, average pitch value) .

WO9827543A2
CLAIM 11
. The method of claim 1 wherein said audio signal is divided into a sequence of frames and further including the steps of classifying each frame of the test sample as relating to speech or music , examining the classifications for a plurality of successive frames (onset frame) , and determining a final classification on the basis of the examined classifications .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (dimensional feature space, feature values) per sample for other frames .
WO9827543A2
CLAIM 1
. A method for discriminating between speech and music content in an audio signal , comprising the steps of : selecting a set of audio signal samples ;
measuring values for a plurality of features in each sample of said set of samples ;
defining a multi-dimensional feature space (average energy) containing data points which respectively correspond to the measured feature values (average energy) for each sample , and labelling each data point as relating to speech or music ;
measuring feature values for a test sample of an audio signal and determining a corresponding data point in said feature space ;
determining the label for at least one data point in said feature space which is close to the data point corresponding to said test sample ;
and classifying the test sample in accordance with the determined label .

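The WO9827543A2 reference mapped above classifies a test sample by measuring feature values, locating nearby labelled points in a multi-dimensional feature space, and adopting their speech/music label. A nearest-neighbour sketch; the choice of k, the Euclidean metric and the two toy features are assumptions.

import numpy as np

def classify_sample(train_features, train_labels, test_features, k=3):
    """Minimal nearest-neighbour sketch of WO9827543A2 claim 1: label a test
    sample from the labels of its closest points in the feature space.
    k and the Euclidean distance are assumptions."""
    d = np.linalg.norm(train_features - test_features, axis=1)
    nearest = train_labels[np.argsort(d)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]

features = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])   # toy feature space
labels = np.array(["speech", "speech", "music", "music"])
print(classify_sample(features, labels, np.array([0.85, 0.15])))         # -> speech
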
US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (dimensional feature space, feature values) per sample for other frames .
WO9827543A2
CLAIM 1
. A method for discriminating between speech and music content in an audio signal , comprising the steps of : selecting a set of audio signal samples ;
measuring values for a plurality of features in each sample of said set of samples ;
defining a multi-dimensional feature space (average energy) containing data points which respectively correspond to the measured feature values (average energy) for each sample , and labelling each data point as relating to speech or music ;
measuring feature values for a test sample of an audio signal and determining a corresponding data point in said feature space ;
determining the label for at least one data point in said feature space which is close to the data point corresponding to said test sample ;
and classifying the test sample in accordance with the determined label .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6199037B1

Filed: 1997-12-04     Issued: 2001-03-06

Joint quantization of speech subframe voicing metrics and fundamental frequencies

(Original Assignee) Digital Voice Systems Inc     (Current Assignee) Digital Voice Systems Inc

John C. Hardwick
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .

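The US6199037B1 reference mapped above jointly quantizes subframe fundamental frequencies by forming residuals against a transformed average and quantizing the combined residuals. A sketch under stated assumptions; the log2 transform and the uniform quantization step are illustrative choices, not taken from the reference.

import numpy as np

def joint_quantize_f0(f0_subframes, step=0.02):
    """Sketch of the joint quantization in US6199037B1 claim 1: residuals are
    the difference between a transformed average of the subframe fundamental
    frequencies and each subframe value; the combined residuals are then
    quantized.  The log2 transform and uniform step are assumptions."""
    logs = np.log2(np.asarray(f0_subframes, dtype=float))
    avg = float(logs.mean())                         # transformed average
    residuals = avg - logs                           # one residual per subframe
    indices = np.round(residuals / step).astype(int) # quantize the combined residuals
    return avg, indices.tolist()

print(joint_quantize_f0([118.0, 121.0, 125.0, 124.0]))
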
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6199037B1
CLAIM 20
. A speech encoder (last frame, replacement frame) for encoding speech into a frame of bits , the encoder comprising : means for digitizing a speech signal into a sequence of digital speech samples ;
means for estimating a set of voicing metrics parameters for a group of digital speech samples , the set including multiple voicing metrics parameters ;
means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits ;
and means for forming a frame of bits including the encoder voicing metrics bits .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6199037B1
CLAIM 20
. A speech encoder (last frame, replacement frame) for encoding speech into a frame of bits , the encoder comprising : means for digitizing a speech signal into a sequence of digital speech samples ;
means for estimating a set of voicing metrics parameters for a group of digital speech samples , the set including multiple voicing metrics parameters ;
means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits ;
and means for forming a frame of bits including the encoder voicing metrics bits .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6199037B1
CLAIM 20
. A speech encoder (last frame, replacement frame) for encoding speech into a frame of bits , the encoder comprising : means for digitizing a speech signal into a sequence of digital speech samples ;
means for estimating a set of voicing metrics parameters for a group of digital speech samples , the set including multiple voicing metrics parameters ;
means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits ;
and means for forming a frame of bits including the encoder voicing metrics bits .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .
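Editorial note: one concrete reading of the energy-information "computer" recited in claim 16 is that the encoder reports the maximum of the squared signal for frames classified as voiced or onset and the mean energy per sample otherwise. A minimal Python sketch under that assumption (the class labels and the log-domain conversion are illustrative choices, not requirements of the claim):

    import numpy as np

    VOICED_LIKE = {"voiced", "onset"}

    def energy_information_parameter(frame, frame_class):
        # Voiced/onset frames: maximum of the signal energy (peak squared sample);
        # other frames: average energy per sample.
        x = np.asarray(frame, dtype=float)
        energy = float(np.max(x * x)) if frame_class in VOICED_LIKE else float(np.mean(x * x))
        return 10.0 * np.log10(energy + 1e-12)       # illustrative log-domain value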

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6199037B1
CLAIM 20
. A speech encoder (last frame, replacement frame) for encoding speech into a frame of bits , the encoder comprising : means for digitizing a speech signal into a sequence of digital speech samples ;
means for estimating a set of voicing metrics parameters for a group of digital speech samples , the set including multiple voicing metrics parameters ;
means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits ;
and means for forming a frame of bits including the encoder voicing metrics bits .
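Editorial note: the energy control recited in claim 17 can be pictured as a per-sample gain ramp across the first good frame: the gain at the frame start matches the synthesis energy to the energy at the end of the concealed frame, the gain at the frame end targets the received energy information, and any upward excursion is capped. A hedged Python sketch (the 32-sample measurement windows, the linear ramp and the 1.2 cap are assumptions, not values taken from the patent):

    import numpy as np

    def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=1.2):
        # Gain at the frame start matches the energy at the end of the concealed frame;
        # gain at the frame end converges toward the received energy information;
        # both gains are capped to limit any increase in energy.
        x = np.asarray(synth, dtype=float)
        e_begin = float(np.mean(x[:32] ** 2)) + 1e-12    # assumed 32-sample windows
        e_end = float(np.mean(x[-32:] ** 2)) + 1e-12
        g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)
        g1 = min(np.sqrt(e_target / e_end), max_gain)
        gains = np.linspace(g0, g1, num=len(x))          # sample-by-sample interpolation
        return x * gains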

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .
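Editorial note: claims 18 and 19 add two gain policies for the first good frame after an erasure: cap the scaling gain when that frame is an onset, and suppress the gain ramp (start gain equal to end gain) during a voiced-to-unvoiced transition and during a comfort-noise-to-active-speech transition. An illustrative Python sketch of that decision logic (the labels and the onset cap value are assumptions):

    def select_scaling_gains(g0, g1, last_good_class, first_good_class,
                             last_good_was_comfort_noise, first_good_is_active_speech,
                             onset_cap=1.0):
        # Claim 18: cap the scaling gain when the first good frame is an onset.
        if first_good_class == "onset":
            g0, g1 = min(g0, onset_cap), min(g1, onset_cap)
        # Claim 19: make the start-of-frame gain equal to the end-of-frame gain during a
        # voiced-to-unvoiced transition or a comfort-noise-to-active-speech transition.
        voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                              and first_good_class == "unvoiced")
        cn_to_active = last_good_was_comfort_noise and first_good_is_active_speech
        if voiced_to_unvoiced or cn_to_active:
            g0 = g1
        return g0, g1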

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6199037B1
CLAIM 20
. A speech encoder (last frame, replacement frame) for encoding speech into a frame of bits , the encoder comprising : means for digitizing a speech signal into a sequence of digital speech samples ;
means for estimating a set of voicing metrics parameters for a group of digital speech samples , the set including multiple voicing metrics parameters ;
means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits ;
and means for forming a frame of bits including the encoder voicing metrics bits .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6199037B1
CLAIM 1
. A method of encoding speech into a frame of bits , the method comprising : digitizing a speech signal (speech signal, decoder determines concealment) into a sequence of digital speech samples ;
dividing the digital speech samples into a sequence of subframes , each of the subframes including multiple digital speech samples ;
estimating a fundamental frequency parameter for each subframe ;
designating subframes from the sequence of subframes as corresponding to a frame ;
jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits ;
and including the encoder fundamental frequency bits in a frame of bits , wherein the joint quantization comprises : computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter ;
combining the residual fundamental frequency parameters from the subframes of the frame ;
and quantizing the combined residual parameters .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6199037B1
CLAIM 20
. A speech encoder (last frame, replacement frame) for encoding speech into a frame of bits , the encoder comprising : means for digitizing a speech signal into a sequence of digital speech samples ;
means for estimating a set of voicing metrics parameters for a group of digital speech samples , the set including multiple voicing metrics parameters ;
means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits ;
and means for forming a frame of bits including the encoder voicing metrics bits .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH10190498A

Filed: 1997-11-14     Issued: 1998-07-21

Improved method of generating comfort noise during discontinuous transmission (不連続伝送中に快適雑音を発生させる改善された方法)

(Original Assignee) Nokia Mobile Phones Ltd; ノキア モービル フォーンズ リミテッド     

Kari Jarvinen, Pekka Kapanen, Jani Rotola-Pukkila, Vesa Ruoppila, ヤルビネン カリ, ルオッピラ ベサ, カパネン ペッカ, ロトラ−プッキラ ヤニ
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (フレーム間) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (短期間) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (の平均) response (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) of the low-pass filter each with a distance corresponding to an average pitch value (それぞれ独立) from the preceding impulse response (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH10190498A
CLAIM 25
[Claim 25] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising the steps of: buffering a set of speech coding parameters when speech ceases; replacing, during an averaging period, those speech coding parameters of the set that do not represent the background noise well with speech coding parameters that represent the background noise well; and determining an average (first impulse) of the set of speech coding parameters.

JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames (onset frame) within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 28
[Claim 28] The method according to claim 25, wherein the step of determining the average comprises calculating an average excitation signal gain g_mean and average short-term (pitch period) spectral coefficients f_mean(i).

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 48
[Claim 48] The apparatus according to claim 45, wherein the data processing means performs the identification and replacement independently (average pitch value) for the excitation signal gain value g and for the line spectral pair (LSP) vector f_i of the speech coding parameters.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.
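Editorial note: the artificial periodic excitation recited in claim 1 is a low-pass filtered pulse train: one copy of the low-pass filter impulse response is centered on the quantized position of the first glottal pulse and further copies follow at the rounded average pitch up to the end of the affected region. A minimal Python sketch under those assumptions (the impulse-response centering convention is an illustrative choice, not the patent's reference code):

    import numpy as np

    def build_periodic_excitation(frame_length, first_pulse_pos, avg_pitch, lp_ir):
        # Place copies of the low-pass filter impulse response lp_ir, the first centered
        # on the quantized glottal-pulse position, the following ones spaced by the
        # rounded average pitch, up to the end of the affected region.
        lp_ir = np.asarray(lp_ir, dtype=float)
        exc = np.zeros(frame_length)
        half = len(lp_ir) // 2
        pos = int(first_pulse_pos)
        step = max(1, int(round(avg_pitch)))
        while pos < frame_length:
            start = pos - half
            lo, hi = max(0, start), min(frame_length, start + len(lp_ir))
            exc[lo:hi] += lp_ir[lo - start:hi - start]
            pos += step
        return exc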

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (短期間) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 28
[Claim 28] The method according to claim 25, wherein the step of determining the average comprises calculating an average excitation signal gain g_mean and average short-term (pitch period) spectral coefficients f_mean(i).

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.
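Editorial note: claims 3 and 11 locate the first glottal pulse as the sample of maximum amplitude within a pitch period and quantize its position; claims 2 and 10 additionally encode its shape, sign and amplitude. A minimal Python sketch of the search and a uniform position quantizer (searching the LP residual and the 4-sample quantization step are assumptions made for illustration only):

    import numpy as np

    def first_glottal_pulse(residual, pitch, step=4):
        # Sample of maximum amplitude within the first pitch period of the LP residual,
        # with a uniform quantizer on its position (step of 4 samples is an assumption).
        period = np.asarray(residual[:int(pitch)], dtype=float)
        pos = int(np.argmax(np.abs(period)))
        sign = 1 if period[pos] >= 0 else -1             # cf. shape/sign/amplitude in claims 2 and 10
        quantized_pos = int(round(pos / step)) * step
        return quantized_pos, sign, float(abs(period[pos]))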

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames (有するフレーム) .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 41
[Claim 41] The method according to claim 40, further comprising, after determining the distance ΔS_i for each frame in the averaging period: ordering the distances according to their values; and taking the frame (other frames) having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median frame of the averaging period, its distance being denoted ΔS_med and its speech coding parameters being denoted g_med and f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (短期間) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 28
[Claim 28] The method according to claim 25, wherein the step of determining the average comprises calculating an average excitation signal gain g_mean and average short-term (pitch period) spectral coefficients f_mean(i).

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (フレーム間) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (短期間) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (の平均) response (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) of the low-pass filter each with a distance corresponding to an average pitch value (それぞれ独立) from the preceding impulse response (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH10190498A
CLAIM 25
[Claim 25] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising the steps of: buffering a set of speech coding parameters when speech ceases; replacing, during an averaging period, those speech coding parameters of the set that do not represent the background noise well with speech coding parameters that represent the background noise well; and determining an average (first impulse) of the set of speech coding parameters.

JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames (onset frame) within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 28
[Claim 28] The method according to claim 25, wherein the step of determining the average comprises calculating an average excitation signal gain g_mean and average short-term (pitch period) spectral coefficients f_mean(i).

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 48
[Claim 48] The apparatus according to claim 45, wherein the data processing means performs the identification and replacement independently (average pitch value) for the excitation signal gain value g and for the line spectral pair (LSP) vector f_i of the speech coding parameters.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.
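Editorial note: for context on the JPH10190498A mapping, its claims 25, 26 and 35 describe buffering speech-coding parameters over an averaging period, replacing the parameter farthest from the others (when beyond a threshold) with the parameter closest to the rest, and averaging the result for comfort-noise generation. A hedged Python sketch of that replacement-then-average step, assuming Euclidean distances over parameter vectors (names and the distance measure are illustrative assumptions):

    import numpy as np

    def replace_outlier_then_average(params, threshold):
        # params: (N, M) buffered parameter vectors over the averaging period.
        p = np.asarray(params, dtype=float)
        d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1).sum(axis=1)
        farthest, nearest = int(np.argmax(d)), int(np.argmin(d))
        if d[farthest] > threshold:
            p = p.copy()
            p[farthest] = p[nearest]                     # replace the outlying vector
        return p.mean(axis=0)                            # average used for comfort noise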

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (短期間) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 28
[Claim 28] The method according to claim 25, wherein the step of determining the average comprises calculating an average excitation signal gain g_mean and average short-term (pitch period) spectral coefficients f_mean(i).

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames (有するフレーム) .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 41
[Claim 41] The method according to claim 40, further comprising, after determining the distance ΔS_i for each frame in the averaging period: ordering the distances according to their values; and taking the frame (other frames) having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median frame of the averaging period, its distance being denoted ΔS_med and its speech coding parameters being denoted g_med and f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
[Claim 50] A method of generating comfort noise (CN) in a digital mobile terminal using a discontinuous transmission scheme, the method comprising: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling the white-noise excitation sequence to produce a scaled noise sequence; and processing the scaled noise sequence with a synthesis filter having fixed coefficients optimized (energy information parameter) so that the scaled noise sequence has at least one of a desired comfort-noise characteristic and a frequency response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectrum control (RESC) filter having the transmitted coefficients.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter (有するベクトル, 周波数応答, 応答特性, 他のパラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10190498A
CLAIM 26
[Claim 26] The method according to claim 25, wherein the replacing step comprises: measuring the distances between the speech coding parameters of the individual frames within the averaging period; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is larger than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within the averaging period is smallest.

JPH10190498A
CLAIM 35
[Claim 35] The method according to claim 33, further comprising: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; taking the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing the median value of P (0 ≤ P ≤ N−1) LSP vectors f_i with the median vector f_med.

JPH10190498A
CLAIM 50
A method of generating comfort noise (CN) in a digital mobile terminal using discontinuous transmission, comprising the steps of: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling said white-noise excitation sequence to produce a scaled noise sequence; and processing said scaled noise sequence in a synthesis filter having fixed coefficients optimized (energy information parameter) so that the result has at least one of a desired comfort-noise characteristic or a frequency-response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectral control (RESC) filter having transmitted coefficients.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (optimized) and a phase information parameter (vector having, frequency response, response characteristic, other parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (short-term) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10190498A
CLAIM 26
The method according to claim 25, wherein the replacing step included in said steps comprises: measuring, between the individual frames within said averaging period, the distance between their respective speech coding parameters; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is greater than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within said averaging period is smallest.

JPH10190498A
CLAIM 28
The method according to claim 25, wherein the averaging step comprises the step of calculating an average excitation signal gain g_mean and average short-term (pitch period) spectral coefficients f_mean(i).

JPH10190498A
CLAIM 35
The method according to claim 33, further comprising the steps of: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; designating the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing P (0 ≤ P ≤ N−1) of the LSP vectors f_i with said median vector f_med.

JPH10190498A
CLAIM 50
A method of generating comfort noise (CN) in a digital mobile terminal using discontinuous transmission, comprising the steps of: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling said white-noise excitation sequence to produce a scaled noise sequence; and processing said scaled noise sequence in a synthesis filter having fixed coefficients optimized (energy information parameter) so that the result has at least one of a desired comfort-noise characteristic or a frequency-response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectral control (RESC) filter having transmitted coefficients.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (optimized) and a phase information parameter (vector having, frequency response, response characteristic, other parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames (frame having) .
JPH10190498A
CLAIM 26
The method according to claim 25, wherein the replacing step included in said steps comprises: measuring, between the individual frames within said averaging period, the distance between their respective speech coding parameters; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is greater than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within said averaging period is smallest.

JPH10190498A
CLAIM 35
The method according to claim 33, further comprising the steps of: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; designating the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing P (0 ≤ P ≤ N−1) of the LSP vectors f_i with said median vector f_med.

JPH10190498A
CLAIM 41
The method according to claim 40, further comprising, after determining the distance ΔS_i for each frame within the averaging period, the steps of: ordering said distances according to their values; and designating the frame (other frames) having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median frame of the averaging period, its distance being denoted ΔS_med and its speech coding parameters g_med and f_med.

JPH10190498A
CLAIM 50
A method of generating comfort noise (CN) in a digital mobile terminal using discontinuous transmission, comprising the steps of: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling said white-noise excitation sequence to produce a scaled noise sequence; and processing said scaled noise sequence in a synthesis filter having fixed coefficients optimized (energy information parameter) so that the result has at least one of a desired comfort-noise characteristic or a frequency-response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectral control (RESC) filter having transmitted coefficients.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (optimized) and a phase information parameter (vector having, frequency response, response characteristic, other parameters) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
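The relation recited at the end of claim 25 rescales the decoder-side excitation when the LP filter of the first good frame has a higher gain than the filter used during the erasure. A small sketch of that scaling follows, using the energies of the LP-filter impulse responses exactly as named in the claim; the impulse-response truncation length and the helper names are illustrative assumptions.

import numpy as np

def lp_impulse_response_energy(a, length=64):
    # Energy of the impulse response of the LP synthesis filter 1/A(z),
    # with a = [1, a1, ..., aM].  The truncation length is an assumption.
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]      # h[n] = delta[n] - sum_k a[k] h[n-k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(E1, a_last_good, a_first_good):
    # E_q = E_1 * E_LP0 / E_LP1, where E_LP0 and E_LP1 are impulse-response
    # energies of the LP filters of the last good frame before the erasure
    # and of the first good frame after it.
    E_LP0 = lp_impulse_response_energy(a_last_good)
    E_LP1 = lp_impulse_response_energy(a_first_good)
    return E1 * E_LP0 / E_LP1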
JPH10190498A
CLAIM 26
The method according to claim 25, wherein the replacing step included in said steps comprises: measuring, between the individual frames within said averaging period, the distance between their respective speech coding parameters; identifying the speech coding parameter whose distance from the other parameters (first impulse response, impulse responses, preceding impulse response, phase information parameter) within the averaging period is largest; and, if said distance is greater than a predetermined threshold, replacing the identified speech coding parameter with the speech coding parameter whose measured distance to the other speech coding parameters within said averaging period is smallest.

JPH10190498A
CLAIM 35
The method according to claim 33, further comprising the steps of: after determining the spectral distance ΔS_i for each LSP vector f_i in the averaging period, ordering the spectral distances according to their values; designating the vector (first impulse response, impulse responses, preceding impulse response, phase information parameter) f_i having the smallest distance ΔS_i within the averaging period (i = 1, 2, ..., N) as the median vector f_med of the averaging period, its distance being denoted ΔS_med; and replacing P (0 ≤ P ≤ N−1) of the LSP vectors f_i with said median vector f_med.

JPH10190498A
CLAIM 50
A method of generating comfort noise (CN) in a digital mobile terminal using discontinuous transmission, comprising the steps of: transmitting CN parameters to a receiver when speech ceases; and shaping the spectral content of an excitation signal by the steps of: forming the excitation signal from a white-noise excitation sequence; scaling said white-noise excitation sequence to produce a scaled noise sequence; and processing said scaled noise sequence in a synthesis filter having fixed coefficients optimized (energy information parameter) so that the result has at least one of a desired comfort-noise characteristic or a frequency-response characteristic similar to the frequency response (first impulse response, impulse responses, preceding impulse response, phase information parameter) characteristic of a random excitation spectral control (RESC) filter having transmitted coefficients.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5960389A

Filed: 1997-11-06     Issued: 1999-09-28

Methods for generating comfort noise during discontinuous transmission

(Original Assignee) Nokia Mobile Phones Ltd     (Current Assignee) Nokia Technologies Oy

Kari Jarvinen, Pekka Kapanen, Vesa Ruoppila, Jani Rotola-Pukkila
US7693710B2
CLAIM 1
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
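Claim 1 builds the periodic excitation of a lost onset frame as a low-pass filtered pulse train: the first impulse response of the low-pass filter is centred on the quantized position of the first glottal pulse, and further impulse responses follow at the average pitch spacing. The sketch below illustrates that construction with a simple windowed-sinc low-pass filter; the filter order, cutoff and reconstruction length are illustrative assumptions rather than the codec's actual values.

import numpy as np

def lowpass_impulse_response(num_taps=31, cutoff=0.25):
    # Windowed-sinc low-pass design (illustrative; not the codec's actual filter).
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    return 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(num_taps)

def artificial_periodic_excitation(frame_len, first_pulse_pos, avg_pitch,
                                   num_taps=31, cutoff=0.25):
    # Centre one impulse response on the quantized first-glottal-pulse
    # position, then repeat it every avg_pitch samples up to the end of
    # the reconstructed region (here: one frame).
    h = lowpass_impulse_response(num_taps, cutoff)
    half = (num_taps - 1) // 2
    exc = np.zeros(frame_len + num_taps)   # padded so the filter tails can be truncated
    pos = float(first_pulse_pos)
    while pos < frame_len:
        i = int(round(pos))
        exc[i:i + num_taps] += h           # impulse response whose centre lands at sample i
        pos += avg_pitch
    return exc[half:half + frame_len]      # drop the padding; centres sit at the pulse positions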
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US5960389A
CLAIM 56
. A method for producing comfort noise (CN) , comprising the steps of : in response to a speech pause , transmitting CN parameters to a receiver ;
and shaping the spectral content of an excitation by steps of , forming an excitation from a white noise excitation sequence ;
scaling the white noise excitation sequence to produce a scaled white noise excitation sequence ;
and processing the scaled white noise excitation sequence in a synthesis filter having fixed coefficients that are optimized to provide at least one of a desired comfort noise quality or to cause the frequency response (first impulse, impulse responses) of the synthesis filter to resemble that of a random excitation spectral control (RESC) filter having transmitted coefficients .
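US5960389A claim 56 shapes comfort noise by scaling a white-noise sequence and passing it through a synthesis filter with fixed coefficients chosen to approximate the RESC filter's frequency response. A minimal sketch of that signal path follows; the all-pole filter form, the gain and the sequence length here are placeholder assumptions, not values taken from the reference.

import numpy as np

def comfort_noise(fixed_coeffs, gain, num_samples, seed=0):
    # Scaled white-noise excitation filtered by an all-pole synthesis filter
    # 1/A(z) with fixed coefficients [1, a1, ..., aM] (illustrative sketch).
    rng = np.random.default_rng(seed)
    excitation = gain * rng.standard_normal(num_samples)   # scaled white-noise sequence
    a = np.asarray(fixed_coeffs, dtype=float)
    out = np.zeros(num_samples)
    for n in range(num_samples):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]                    # synthesis-filter recursion
        out[n] = acc
    return out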

US7693710B2
CLAIM 2
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
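Claim 3 locates the first glottal pulse as the sample of maximum amplitude inside the first pitch period of the frame and quantizes its position. The sketch below shows one way to do this on a residual signal; the uniform quantization step is an illustrative assumption, since the claim does not fix a particular quantizer.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    # Find the sample of maximum amplitude within the first pitch period and
    # quantize its position with a uniform step (step size is an assumption).
    search = np.asarray(residual[:int(pitch_period)], dtype=float)
    pos = int(np.argmax(np.abs(search)))       # sample of maximum amplitude
    sign = 1 if search[pos] >= 0 else -1       # sign of the pulse
    q_pos = int(round(pos / step)) * step      # uniformly quantized position
    return q_pos, sign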
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
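Claim 4 computes the energy information parameter class-dependently: for frames classified as voiced or onset it is tied to a maximum of the signal energy, while for other frames it is an average energy per sample. A minimal sketch follows, assuming a pitch-length sliding window for the voiced/onset maximum and a plain mean-square energy otherwise; both window choices are assumptions.

import numpy as np

def energy_information(frame, frame_class, pitch_period):
    # Maximum windowed energy for voiced/onset frames, average energy per
    # sample otherwise.  Window length is an illustrative assumption.
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        win = max(int(pitch_period), 1)
        energies = [np.sum(x[i:i + win] ** 2) for i in range(0, len(x) - win + 1)]
        return max(energies) if energies else float(np.sum(x ** 2))
    return float(np.mean(x ** 2))              # average energy per sample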
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
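Claim 5 controls the synthesized-signal energy in the first good frame after an erasure: the frame is first scaled so that its starting energy matches the energy at the end of the last concealed frame, and the gain then converges toward the value implied by the received energy information parameter, with the upward movement capped. A small sketch of that two-point gain interpolation follows; the linear interpolation law, the quarter-frame energy windows and the cap value are illustrative assumptions.

import numpy as np

def rescale_first_good_frame(frame, energy_end_concealed, target_energy,
                             max_gain_increase=2.0):
    # Match the concealed-signal energy at the frame start and converge toward
    # the received target energy by the frame end, limiting the increase.
    x = np.asarray(frame, dtype=float)
    seg = max(len(x) // 4, 1)
    e0 = float(np.mean(x[:seg] ** 2)) + 1e-12            # energy near frame start
    e1 = float(np.mean(x[-seg:] ** 2)) + 1e-12           # energy near frame end
    g0 = np.sqrt(energy_end_concealed / e0)              # match last concealed frame
    g1 = min(np.sqrt(target_energy / e1), g0 * max_gain_increase)  # limited increase
    gains = np.linspace(g0, g1, len(x))                  # sample-by-sample interpolation
    return x * gains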
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (PC mode) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (CN parameters, comfort noise) and the first non erased frame received after frame erasure is encoded as active speech .
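Claim 7 names two situations in which the scaling gain at the beginning of the first good frame is simply set equal to the gain used at its end: a voiced-to-unvoiced transition across the erasure, and a comfort-noise-to-active-speech transition. A compact sketch of that decision follows; the class labels and flags are an illustrative encoding, not the codec's actual signalling.

def use_end_gain_at_start(last_good_class, first_good_class,
                          last_good_was_cn, first_good_is_active):
    # True when the start-of-frame gain should equal the end-of-frame gain
    # per claim 7 (class names/flags are an illustrative encoding).
    voiced_to_unvoiced = (
        last_good_class in ("voiced transition", "voiced", "onset")
        and first_good_class == "unvoiced"
    )
    cn_to_active = last_good_was_cn and first_good_is_active
    return voiced_to_unvoiced or cn_to_active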
US5960389A
CLAIM 1
. A method for producing comfort noise (comfort noise, decoder determines concealment) (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters (comfort noise, decoder determines concealment) ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (continuous transmission, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5960389A
CLAIM 1
. A method for producing comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission (LP filter) , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 31
. A method for generating comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , buffering a set of speech coding parameters ;
within an averaging period , replacing speech coding parameters of the set that are not representative of background noise (LP filter) with speech coding parameters that are representative of the background noise ;
and averaging the set of speech coding parameters .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (continuous transmission, background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (PC mode) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5960389A
CLAIM 1
. A method for producing comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission (LP filter) , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 31
. A method for generating comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , buffering a set of speech coding parameters ;
within an averaging period , replacing speech coding parameters of the set that are not representative of background noise (LP filter) with speech coding parameters that are representative of the background noise ;
and averaging the set of speech coding parameters .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (continuous transmission, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5960389A
CLAIM 1
. A method for producing comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission (LP filter) , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 31
. A method for generating comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , buffering a set of speech coding parameters ;
within an averaging period , replacing speech coding parameters of the set that are not representative of background noise (LP filter) with speech coding parameters that are representative of the background noise ;
and averaging the set of speech coding parameters .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US5960389A
CLAIM 56
. A method for producing comfort noise (CN) , comprising the steps of : in response to a speech pause , transmitting CN parameters to a receiver ;
and shaping the spectral content of an excitation by steps of , forming an excitation from a white noise excitation sequence ;
scaling the white noise excitation sequence to produce a scaled white noise excitation sequence ;
and processing the scaled white noise excitation sequence in a synthesis filter having fixed coefficients that are optimized to provide at least one of a desired comfort noise quality or to cause the frequency response (first impulse, impulse responses) of the synthesis filter to resemble that of a random excitation spectral control (RESC) filter having transmitted coefficients .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (PC mode) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (CN parameters, comfort noise) and the first non erased frame received after frame erasure is encoded as active speech .
US5960389A
CLAIM 1
. A method for producing comfort noise (comfort noise, decoder determines concealment) (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters (comfort noise, decoder determines concealment) ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (continuous transmission, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5960389A
CLAIM 1
. A method for producing comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission (LP filter) , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 31
. A method for generating comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , buffering a set of speech coding parameters ;
within an averaging period , replacing speech coding parameters of the set that are not representative of background noise (LP filter) with speech coding parameters that are representative of the background noise ;
and averaging the set of speech coding parameters .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (continuous transmission, background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (PC mode) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5960389A
CLAIM 1
. A method for producing comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission (LP filter) , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 31
. A method for generating comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , buffering a set of speech coding parameters ;
within an averaging period , replacing speech coding parameters of the set that are not representative of background noise (LP filter) with speech coding parameters that are representative of the background noise ;
and averaging the set of speech coding parameters .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (continuous transmission, background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5960389A
CLAIM 1
. A method for producing comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission (LP filter) , comprising the steps of : in response to a speech pause , calculating random excitation spectral control (RESC) parameters ;
transmitting the RESC parameters to a receiver together with predetermined ones of CN parameters ;
receiving the RESC parameters ;
and shaping the spectral content of an excitation using the received RESC parameters prior to applying the excitation to a synthesis filter .

US5960389A
CLAIM 31
. A method for generating comfort noise (CN) in a digital mobile terminal that uses a discontinuous transmission , comprising the steps of : in response to a speech pause , buffering a set of speech coding parameters ;
within an averaging period , replacing speech coding parameters of the set that are not representative of background noise (LP filter) with speech coding parameters that are representative of the background noise ;
and averaging the set of speech coding parameters .

US5960389A
CLAIM 38
. A method as in claim 37 , where the step of determining the spectral distance is accomplished in accordance with the expression ##EQU24## where M is the degree of the LPC mode (frame erasure) l , and f i (k) is the kth LSP parameter of the ith frame in the averaging period .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5884253A

Filed: 1997-10-03     Issued: 1999-03-16

Prototype waveform speech coding with interpolation of pitch, pitch-period waveforms, and synthesis filter

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Willem Bastiaan Kleijn
US7693710B2
CLAIM 1
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .
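For claim 1 charted above, the artificial onset construction can be sketched as follows, under stated assumptions: a periodic excitation part is built as a low-pass filtered train of pulses, the first impulse response of the low-pass filter being centered on the decoded position of the first glottal pulse and the remaining copies being placed one average pitch period apart. The FIR low-pass taps, frame length and pulse position below are illustrative assumptions, not values from the patent.

```python
def build_onset_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps):
    """Return an excitation buffer containing a low-pass filtered pulse train."""
    half = len(lp_taps) // 2
    exc = [0.0] * frame_len
    pos = first_pulse_pos
    while pos < frame_len:
        # Center one copy of the low-pass impulse response on 'pos'.
        for i, tap in enumerate(lp_taps):
            idx = pos - half + i
            if 0 <= idx < frame_len:
                exc[idx] += tap
        pos += avg_pitch                      # next pulse, one pitch period later
    return exc

# Example: 64-sample frame, first glottal pulse decoded at sample 11,
# average pitch of 25 samples, 5-tap symmetric low-pass response (assumed).
excitation = build_onset_excitation(
    frame_len=64, first_pulse_pos=11, avg_pitch=25,
    lp_taps=[0.1, 0.25, 0.3, 0.25, 0.1],
)
print([round(v, 2) for v in excitation[:32]])
```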

US7693710B2
CLAIM 2
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .
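As a counterpart, the prototype-waveform interpolation recited in US5884253A claim 1 can be illustrated by the hypothetical sketch below: the pitch period and the frequency-domain parameters of two received prototypes are linearly interpolated, and a residual segment of the interpolated length is regenerated. Representing the "frequency domain parameters" as a short complex DFT spectrum is an assumption made only for illustration.

```python
import cmath

def interpolate_prototypes(pitch0, spec0, pitch1, spec1, alpha):
    """Linearly interpolate pitch period and complex spectra, then rebuild
    one residual segment of the interpolated pitch-period length."""
    pitch_i = round((1.0 - alpha) * pitch0 + alpha * pitch1)
    n_bins = min(len(spec0), len(spec1))
    spec_i = [(1.0 - alpha) * spec0[k] + alpha * spec1[k] for k in range(n_bins)]
    # Inverse DFT of the interpolated spectrum over the interpolated period.
    segment = []
    for n in range(pitch_i):
        s = sum(spec_i[k] * cmath.exp(2j * cmath.pi * k * n / pitch_i)
                for k in range(n_bins))
        segment.append(s.real / pitch_i)
    return pitch_i, segment

# Example with two short, made-up prototype spectra.
p_i, seg = interpolate_prototypes(
    pitch0=20, spec0=[0, 4 + 0j, 1 + 1j],
    pitch1=24, spec1=[0, 3 - 1j, 2 + 0j],
    alpha=0.5,
)
print(p_i, [round(v, 3) for v in seg[:8]])
```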

US7693710B2
CLAIM 3
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .
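The phase-information step of claim 3 charted above can be sketched as follows: the first glottal pulse is taken as the sample of maximum amplitude within one pitch period of the LP residual, and its position is quantized. The fixed 4-sample quantization step and the toy residual are illustrative assumptions.

```python
def quantize_first_glottal_pulse(residual, pitch_period, step=4):
    """Return (position, quantized_position_index) of the max-amplitude
    sample inside the first pitch period of the residual."""
    window = residual[:pitch_period]
    pos = max(range(len(window)), key=lambda n: abs(window[n]))
    return pos, pos // step                   # index that would be sent to the decoder

# Example with a toy residual containing one dominant pulse at sample 17.
residual = [0.02 * ((-1) ** n) for n in range(40)]
residual[17] = 0.9
pos, q_idx = quantize_first_glottal_pulse(residual, pitch_period=40)
print(pos, q_idx)       # the decoder would reconstruct the position as q_idx * step
```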

US7693710B2
CLAIM 4
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .
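For the energy-information computation of claim 4 charted above, a hedged sketch is given below: frames classified as voiced or onset use the maximum of the short-term signal energy, while the remaining classes use the average energy per sample. The sliding 8-sample window is an assumption; a pitch-synchronous maximum search is omitted for brevity.

```python
def energy_information(frame, frame_class, win=8):
    if frame_class in ("voiced", "onset"):
        # Maximum short-term energy over the frame (assumed window length).
        return max(sum(x * x for x in frame[i:i + win])
                   for i in range(0, len(frame) - win + 1))
    # Average energy per sample for unvoiced / transition frames.
    return sum(x * x for x in frame) / len(frame)

frame = [0.1, 0.5, -0.4, 0.05, 0.0, 0.3, -0.6, 0.2, 0.1, -0.1, 0.0, 0.05]
print(energy_information(frame, "voiced"))
print(energy_information(frame, "unvoiced"))
```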

US7693710B2
CLAIM 5
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .
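The two-sided energy control of claim 5 charted above can be sketched, under stated assumptions, as a pair of gains interpolated across the first good frame: a start gain makes the energy at the beginning of that frame similar to the energy at the end of the last concealed frame, and an end gain converges toward the transmitted energy value while the increase is capped. The gain cap of 2.0 and the quarter-frame energy windows are illustrative assumptions.

```python
def scale_recovered_frame(synth, e_concealed_end, e_received, gain_cap=2.0):
    eps = 1e-12
    e_begin = sum(x * x for x in synth[:len(synth) // 4]) + eps
    e_end = sum(x * x for x in synth[-(len(synth) // 4):]) + eps
    g0 = min((e_concealed_end / e_begin) ** 0.5, gain_cap)   # match past energy
    g1 = min((e_received / e_end) ** 0.5, gain_cap)          # converge, increase limited
    n = len(synth)
    # Linear gain interpolation across the frame.
    return [((n - 1 - i) * g0 + i * g1) / (n - 1) * x for i, x in enumerate(synth)]

frame = [0.2, -0.1, 0.3, -0.2, 0.25, -0.15, 0.1, -0.05]
print([round(v, 3) for v in scale_recovered_frame(frame, 0.05, 0.08)])
```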

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (first speech) is classified as onset , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (first speech) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .
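The gain-equalization rule of claim 7 charted above can be illustrated by the small conditional sketch below, assuming the classification labels used in the claim: on a voiced-to-unvoiced transition, or on a comfort-noise-to-active-speech transition, the start-of-frame scaling gain is forced equal to the end-of-frame gain so that the energy of the concealed segment is not propagated.

```python
def equalize_gains(g0, g1, last_good_class, first_good_class,
                   last_good_was_cn, first_good_is_active):
    voiced_like = {"voiced transition", "voiced", "onset"}
    if last_good_class in voiced_like and first_good_class == "unvoiced":
        return g1, g1          # voiced -> unvoiced transition
    if last_good_was_cn and first_good_is_active:
        return g1, g1          # comfort noise -> active speech transition
    return g0, g1

print(equalize_gains(1.4, 0.9, "voiced", "unvoiced", False, True))
```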

US7693710B2
CLAIM 8
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (first speech) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (first speech) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 13
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 14
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 15
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 16
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 17
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (first speech) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (second residual signal, first residual signal) , limits to a given value a gain used for scaling the synthesized sound signal .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (first speech) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 20
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (second residual signal, first residual signal) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (first speech) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 22
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 23
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
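A minimal sketch of the first-glottal-pulse search recited in claims 22 and 23: the sample of maximum amplitude within a pitch period is taken as the first glottal pulse and its position is quantized. The uniform quantization step and the returned sign/amplitude encoding are assumptions.

```python
# Sketch only: first glottal pulse taken as the maximum-amplitude sample in one pitch period.
# Assumption: a uniform quantizer with a fixed step for the pulse position.
import numpy as np

def first_glottal_pulse(residual, pitch, step=4):
    """Return (quantized position, sign, amplitude) of the first glottal pulse."""
    seg = np.asarray(residual[:pitch], dtype=float)
    pos = int(np.argmax(np.abs(seg)))              # sample of maximum amplitude within the pitch period
    q_pos = int(round(pos / step)) * step          # quantized position relative to the frame beginning
    return q_pos, int(np.sign(seg[pos])), float(abs(seg[pos]))
```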
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 24
. A device for conducting concealment (communications channel) of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
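A minimal sketch of the energy information computation recited in claim 24, assuming the maximum of the signal energy is evaluated over the last pitch period for voiced or onset frames; variable names are illustrative.

```python
# Sketch only: energy information parameter per frame class.
# Assumption: the "maximum of a signal energy" is evaluated over the last pitch period.
import numpy as np

def energy_info(frame, frame_class, pitch=None):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset") and pitch:
        return float(np.max(x[-pitch:] ** 2))      # maximum of the signal energy (pitch-synchronous)
    return float(np.mean(x ** 2))                  # average energy per sample for other frames
```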
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel (conducting concealment) , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (first speech) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (second residual signal, first residual signal) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0/E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5884253A
CLAIM 1
. A method of synthesizing a speech signal based on signals communicated via a communications channel , the method comprising the steps of : receiving at least two communicated signals , including (i) a first communicated signal comprising a first pitch-period and a first set of frequency domain parameters , the first set of frequency domain parameters representing a first residual signal (decoder concealment, decoder recovery) representative of a first speech (frame erasure, concealing frame erasure) signal segment of a length equal to said first pitch-period , and (ii) a second communicated signal comprising a second pitch-period and a second set of frequency domain parameters , the second set of frequency domain parameters representing a second residual signal (decoder concealment, decoder recovery) representative of a second speech signal segment of a length equal to said second pitch-period ;
interpolating between the first pitch-period and the second pitch-period to generate an interpolated pitch-period ;
interpolating between the first set of frequency domain parameters and the second set of frequency domain parameters to generate a set of interpolated frequency domain parameters ;
generating a reconstructed residual signal based on said set of interpolated frequency domain parameters and on said interpolated pitch-period , the reconstructed residual signal representing an interpolated speech signal segment of a length equal to said interpolated pitch-period ;
and synthesizing the speech signal based on the reconstructed residual signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
WO9912155A1

Filed: 1997-09-30     Issued: 1999-03-11

Channel gain modification system and method for noise reduction in voice communication

(Original Assignee) Qualcomm Incorporated     

Anthony P. Mauro
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
WO9912155A1
CLAIM 1
. A noise suppressor for suppressing the background noise (LP filter) of an audio signal , comprising : a signal to noise ratio (SNR) estimator for generating channel SNR estimates for a first predefined set of frequency channels of said audio signal ;
a gain estimator for generating a gain factor for each of said frequency channels based on a corresponding one of said channel SNR estimates , wherein said gain factor is derived using a gain function which defines gain factor as an increasing function of SNR ;
and a gain adjuster for adjusting the gain level of each of said frequency channels based on said corresponding gain factor .
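A minimal sketch of the channel-gain noise suppressor recited in WO9912155A1 claim 1: per-channel SNR estimates drive a gain factor that increases with SNR, which is then applied to each frequency channel. The specific gain curve and floor are assumptions, not taken from the reference.

```python
# Sketch only: per-channel gain derived from channel SNR estimates (increasing in SNR).
# Assumption: a Wiener-like gain curve snr/(1+snr) with a floor; the reference does not fix the curve here.
import numpy as np

def channel_gains(channel_energy, noise_energy, min_gain=0.1):
    snr = np.asarray(channel_energy, dtype=float) / np.maximum(np.asarray(noise_energy, dtype=float), 1e-12)
    gain = snr / (1.0 + snr)                       # gain factor as an increasing function of SNR
    return np.maximum(gain, min_gain)              # gain adjuster then scales each frequency channel
```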

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 √(E_LP0/E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9912155A1
CLAIM 1
. A noise suppressor for suppressing the background noise (LP filter) of an audio signal , comprising : a signal to noise ratio (SNR) estimator for generating channel SNR estimates for a first predefined set of frequency channels of said audio signal ;
a gain estimator for generating a gain factor for each of said frequency channels based on a corresponding one of said channel SNR estimates , wherein said gain factor is derived using a gain function which defines gain factor as an increasing function of SNR ;
and a gain adjuster for adjusting the gain level of each of said frequency channels based on said corresponding gain factor .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0/E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9912155A1
CLAIM 1
. A noise suppressor for suppressing the background noise (LP filter) of an audio signal , comprising : a signal to noise ratio (SNR) estimator for generating channel SNR estimates for a first predefined set of frequency channels of said audio signal ;
a gain estimator for generating a gain factor for each of said frequency channels based on a corresponding one of said channel SNR estimates , wherein said gain factor is derived using a gain function which defines gain factor as an increasing function of SNR ;
and a gain adjuster for adjusting the gain level of each of said frequency channels based on said corresponding gain factor .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
WO9912155A1
CLAIM 1
. A noise suppressor for suppressing the background noise (LP filter) of an audio signal , comprising : a signal to noise ratio (SNR) estimator for generating channel SNR estimates for a first predefined set of frequency channels of said audio signal ;
a gain estimator for generating a gain factor for each of said frequency channels based on a corresponding one of said channel SNR estimates , wherein said gain factor is derived using a gain function which defines gain factor as an increasing function of SNR ;
and a gain adjuster for adjusting the gain level of each of said frequency channels based on said corresponding gain factor .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (background noise) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 √(E_LP0/E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
WO9912155A1
CLAIM 1
. A noise suppressor for suppressing the background noise (LP filter) of an audio signal , comprising : a signal to noise ratio (SNR) estimator for generating channel SNR estimates for a first predefined set of frequency channels of said audio signal ;
a gain estimator for generating a gain factor for each of said frequency channels based on a corresponding one of said channel SNR estimates , wherein said gain factor is derived using a gain function which defines gain factor as an increasing function of SNR ;
and a gain adjuster for adjusting the gain level of each of said frequency channels based on said corresponding gain factor .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (background noise) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0/E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
WO9912155A1
CLAIM 1
. A noise suppressor for suppressing the background noise (LP filter) of an audio signal , comprising : a signal to noise ratio (SNR) estimator for generating channel SNR estimates for a first predefined set of frequency channels of said audio signal ;
a gain estimator for generating a gain factor for each of said frequency channels based on a corresponding one of said channel SNR estimates , wherein said gain factor is derived using a gain function which defines gain factor as an increasing function of SNR ;
and a gain adjuster for adjusting the gain level of each of said frequency channels based on said corresponding gain factor .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5909663A

Filed: 1997-09-05     Issued: 1999-06-01

Speech decoding method and apparatus for selecting random noise codevectors as excitation signals for an unvoiced speech frame

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
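A minimal sketch of the artificial periodic excitation recited in claim 1 (and claim 13 below): a low-pass filtered pulse train whose first impulse response is centred on the quantized first-glottal-pulse position, with the remaining impulse responses spaced by the average pitch value; the low-pass impulse response is assumed to be supplied.

```python
# Sketch only: artificial periodic excitation for a lost onset frame.
# Assumption: the low-pass filter impulse response h_lp is given and centred on each pulse.
import numpy as np

def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, h_lp):
    exc = np.zeros(frame_len)
    half = len(h_lp) // 2
    pos = q_pulse_pos                               # quantized position of the first glottal pulse
    while pos < frame_len:                          # repeat every average pitch value to the frame end
        for i, h in enumerate(h_lp):                # centre one impulse response on the current pulse
            j = pos - half + i
            if 0 <= j < frame_len:
                exc[j] += h
        pos += avg_pitch
    return exc
```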
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .
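A minimal sketch of the error-evasion idea recited in US5909663A claims 1, 3 and 7: when an error is detected, the decoder avoids reusing the directly preceding excitation waveform, here by adding a noise component. The noise level is an illustrative assumption.

```python
# Sketch only: avoid reusing the directly preceding excitation waveform when an error is flagged.
# Assumption: the different waveform is obtained by adding scaled white noise (claim 3's noise component).
import numpy as np

def unvoiced_excitation(prev_excitation, error_detected, noise_level=0.3, rng=np.random.default_rng()):
    x = np.asarray(prev_excitation, dtype=float)
    if not error_detected:
        return x
    rms = np.sqrt(np.mean(x ** 2))                 # keep the perturbation proportional to the signal level
    return x + noise_level * rms * rng.standard_normal(len(x))
```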

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (detecting step) per sample for other frames .
US5909663A
CLAIM 1
. A speech decoding method for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said method comprising : a waveform-decoding step for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
an error detecting step (average energy) for detecting an error using an error checking code appended to said encoded speech signal ;
and an evading step for evading repeated use of a same waveform as a waveform used in said waveform-decoding step by using a waveform different from a directly preceding waveform when an error is detected in said error detecting step .

US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
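A minimal sketch of the energy control recited in claim 5 (and claim 17 below): the synthesized signal is scaled so that its energy at the beginning of the first good frame matches the end of the last concealed frame, then converged toward the received energy information while limiting the increase. The linear gain ramp and the cap value are assumptions.

```python
# Sketch only: energy control across the first good frame after an erasure.
# Assumptions: a linear per-sample gain ramp and a fixed cap on the gain increase.
import numpy as np

def control_energy(synth, e_prev_end, e_target, max_gain=2.0):
    x = np.asarray(synth, dtype=float)
    q = max(len(x) // 4, 1)
    e_start = np.mean(x[:q] ** 2) + 1e-12               # energy at the beginning of the frame
    e_end = np.mean(x[-q:] ** 2) + 1e-12                # energy toward the end of the frame
    g0 = min(np.sqrt(e_prev_end / e_start), max_gain)   # match the end of the last concealed frame
    g1 = min(np.sqrt(e_target / e_end), max_gain)       # converge to the received energy information
    return x * np.linspace(g0, g1, len(x))
```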
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (noise component) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (noise component) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
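A minimal sketch of the transition rule recited in claim 7: the scaling gain at the beginning of the recovered frame is held equal to the gain at its end for a voiced-to-unvoiced transition, or when active speech resumes after comfort noise; the class and coding labels are illustrative.

```python
# Sketch only: conditions under which the begin-of-frame gain is set equal to the end-of-frame gain.
def flat_gain_transition(prev_class, curr_class, prev_coding, curr_coding):
    voiced_like = {"voiced transition", "voiced", "onset"}
    v_to_uv = prev_class in voiced_like and curr_class == "unvoiced"                  # voiced-to-unvoiced
    cn_to_active = prev_coding == "comfort noise" and curr_coding == "active speech"  # DTX hand-over
    return v_to_uv or cn_to_active
```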
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (noise component) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 √(E_LP0/E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (detecting step) per sample for other frames .
US5909663A
CLAIM 1
. A speech decoding method for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said method comprising : a waveform-decoding step for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
an error detecting step (average energy) for detecting an error using an error checking code appended to said encoded speech signal ;
and an evading step for evading repeated use of a same waveform as a waveform used in said waveform-decoding step by using a waveform different from a directly preceding waveform when an error is detected in said error detecting step .

US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
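A minimal sketch of the two-stage energy control recited above (claim 17), assuming a per-sample linear gain ramp across the first non-erased frame; the gain-limiting factor is an illustrative assumption.

```python
import numpy as np

def scale_first_good_frame(synth: np.ndarray,
                           energy_end_concealed: float,
                           target_energy: float,
                           max_gain_increase: float = 2.0) -> np.ndarray:
    """Scale the synthesized signal of the first non-erased frame (sketch).

    g0 matches the frame-start energy to the energy at the end of the last
    concealed frame; g1 converges toward the received energy information
    parameter by the frame end, while the increase in energy is limited.
    """
    frame_energy = float(np.mean(synth ** 2)) + 1e-12
    g0 = np.sqrt(energy_end_concealed / frame_energy)
    g1 = np.sqrt(target_energy / frame_energy)
    g1 = min(g1, g0 * max_gain_increase)          # limit the energy increase
    gains = np.linspace(g0, g1, num=len(synth))   # per-sample gain interpolation
    return synth * gains
```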
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (noise component) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (noise component) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
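To make the phase-information element concrete, a hypothetical container for the first-glottal-pulse parameters that claim 22 says are encoded and transmitted; the shape codebook index and amplitude step are assumptions, not the patent's bit allocation.

```python
from dataclasses import dataclass

@dataclass
class GlottalPulseInfo:
    """Phase information for the first glottal pulse in a frame (sketch)."""
    position: int     # sample index of the pulse relative to the frame start
    shape_index: int  # index into an assumed pulse-shape codebook
    sign: int         # +1 or -1
    amplitude: float  # quantized pulse amplitude

def encode_glottal_pulse(position: int, shape_index: int, sign: float,
                         amplitude: float, amp_step: float = 0.5) -> GlottalPulseInfo:
    """Bundle the parameters sent from encoder to decoder (illustrative only)."""
    return GlottalPulseInfo(position=int(position),
                            shape_index=int(shape_index),
                            sign=1 if sign >= 0 else -1,
                            amplitude=round(amplitude / amp_step) * amp_step)
```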
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
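A minimal sketch of the glottal-pulse search recited in claim 23, assuming the LP residual is available and using a uniform quantization step as an illustrative assumption.

```python
import numpy as np

def first_glottal_pulse_position(residual: np.ndarray,
                                 pitch_period: int,
                                 quant_step: int = 4) -> int:
    """Locate and quantize the first glottal pulse position (sketch).

    The sample of maximum amplitude within the first pitch period of the
    LP residual is taken as the first glottal pulse; its position is then
    quantized with a uniform step (the step size is an assumption).
    """
    search = np.abs(residual[:pitch_period])
    position = int(np.argmax(search))
    return (position // quant_step) * quant_step
```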
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (noise component) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (detecting step) per sample for other frames .
US5909663A
CLAIM 1
. A speech decoding method for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said method comprising : a waveform-decoding step for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
an error detecting step (average energy) for detecting an error using an error checking code appended to said encoded speech signal ;
and an evading step for evading repeated use of a same waveform as a waveform used in said waveform-decoding step by using a waveform different from a directly preceding waveform when an error is detected in said error detecting step .

US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (noise component) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
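Taking the relation above at face value, a direct transcription; the LP impulse-response energies are computed here with scipy's lfilter and an assumed truncation length.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a: np.ndarray, length: int = 64) -> float:
    """Energy of the impulse response of a 1/A(z) LP synthesis filter.

    `a` holds the LP coefficients [1, a1, ..., ap]; the truncation length
    of the impulse response is an assumption.
    """
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1: float, e_lp0: float, e_lp1: float) -> float:
    """Target excitation energy E_q = E_1 * (E_LP0 / E_LP1) as recited above."""
    return e1 * (e_lp0 / e_lp1)
```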
US5909663A
CLAIM 3
. The speech decoding method as claimed in claim 1 , wherein noise components (sound signal) are added to said excitation signal in said evading step for evading repeated use of said same waveform .

US5909663A
CLAIM 7
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal produced by dividing an input speech signal on a time axis using a pre-set encoding unit and by waveform-encoding a resulting encoding-unit-based time-axis waveform signal , said apparatus comprising : waveform-decoding means for waveform-decoding said encoded speech signal and for producing an encoding-unit-based time-axis waveform signal , wherein said time-axis waveform signal is an excitation signal for synthesis of an unvoiced speech signal ;
error detection means for detecting an error using an error checking code appended to said encoded speech signal ;
and evading means for evading repeated use of a same waveform as a waveform used by said waveform-decoding means by using a waveform different from a directly-preceding waveform when an error is detected by said error detection means .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP0834863A2

Filed: 1997-08-26     Issued: 1998-04-08

Speech coder at low bit rates

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Ozawa Kazunori
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses, second pulses) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses, second pulses) of the low-pass filter each with a distance corresponding to an average pitch (average pitch) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
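A minimal sketch of the artificial periodic excitation described in claim 1, assuming a short FIR low-pass impulse response lp_filter_h and an integer average pitch; subframe bookkeeping is omitted.

```python
import numpy as np

def build_periodic_excitation(frame_len: int,
                              first_pulse_pos: int,
                              avg_pitch: int,
                              lp_filter_h: np.ndarray) -> np.ndarray:
    """Artificial periodic excitation for a lost onset frame (sketch).

    The low-pass filter impulse response is centred on the quantized position
    of the first glottal pulse, and further copies are placed every
    `avg_pitch` samples up to the end of the reconstructed segment.
    """
    excitation = np.zeros(frame_len)
    half = len(lp_filter_h) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start, end = pos - half, pos - half + len(lp_filter_h)
        s0, e0 = max(start, 0), min(end, frame_len)
        excitation[s0:e0] += lp_filter_h[s0 - start:e0 - start]
        pos += avg_pitch
    return excitation
```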
EP0834863A2
CLAIM 4
A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .
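A loose sketch of the retrieval-range mechanism of EP0834863A2 claim 5: a sample position is taken from the pitch prediction signal, shifted by a preset number of samples, and a window around it is searched; the selection metric and window size are assumptions.

```python
import numpy as np

def best_pulse_position(pitch_pred: np.ndarray,
                        target: np.ndarray,
                        shift: int = 2,
                        half_range: int = 8) -> int:
    """Pick a pulse position inside a range derived from the pitch prediction
    signal (sketch of EP0834863A2 claim 5; metric and range are assumptions)."""
    anchor = int(np.argmax(np.abs(pitch_pred)))   # position meeting the condition
    centre = anchor + shift                       # shifted by a preset sample count
    lo = max(centre - half_range, 0)
    hi = min(centre + half_range + 1, len(target))
    window = np.abs(target[lo:hi])
    return lo + int(np.argmax(window))            # best position in the range
```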

EP0834863A2
CLAIM 13
The speech coder according to claim 12 , wherein the feature quantity is an average pitch (average pitch) prediction gain .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
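The two transition cases of claim 7 reduce to a predicate over the classes of the frames bracketing the erasure; a hypothetical helper:

```python
def use_flat_gain(last_good_class: str, first_good_class: str,
                  last_good_is_comfort_noise: bool,
                  first_good_is_active_speech: bool) -> bool:
    """True when the scaling gain at frame start should equal the gain at
    frame end (sketch of the two cases recited in claim 7)."""
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    dtx_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or dtx_to_active
```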
EP0834863A2
CLAIM 2
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

EP0834863A2
CLAIM 12
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0834863A2
CLAIM 1
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

EP0834863A2
CLAIM 2
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

EP0834863A2
CLAIM 4
A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

EP0834863A2
CLAIM 12
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0834863A2
CLAIM 1
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

EP0834863A2
CLAIM 2
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

EP0834863A2
CLAIM 4
A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

EP0834863A2
CLAIM 12
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses, second pulses) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses, second pulses) of the low-pass filter each with a distance corresponding to an average pitch (average pitch) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
EP0834863A2
CLAIM 4
A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

EP0834863A2
CLAIM 13
The speech coder according to claim 12 , wherein the feature quantity is an average pitch (average pitch) prediction gain .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
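Analyst's note: claim 15 adds that the searcher takes the sample of maximum amplitude within a pitch period as the first glottal pulse and that the determiner quantizes that position. A minimal sketch follows, assuming a uniform scalar quantizer over one pitch period; the bit allocation and grid are illustrative assumptions, not taken from the charted text.

```python
import numpy as np

def quantize_pulse_position(residual, pitch, n_bits=6):
    """Hedged sketch: take the maximum-amplitude sample in the first pitch
    period as the first glottal pulse and quantize its position uniformly."""
    position = int(np.argmax(np.abs(residual[:pitch])))    # pulse search
    levels = 2 ** n_bits
    step = pitch / levels                                  # uniform grid over one pitch period
    index = min(levels - 1, int(position / step))          # transmitted index
    reconstructed = int(round((index + 0.5) * step))       # decoder-side estimate
    return position, index, reconstructed

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    res = 0.02 * rng.standard_normal(200)
    res[53] = -0.9                                         # synthetic pulse of maximum amplitude
    print(quantize_pulse_position(res, pitch=70))
```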
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
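Analyst's note: the energy information computer of claim 16 uses the maximum of the signal energy for frames classified as voiced or onset and the average energy per sample for other frames. The sketch below only illustrates that distinction; any pitch-synchronous windowing used in practice is simplified to a last-pitch-cycle maximum and is an assumption.

```python
import numpy as np

def energy_information(frame, frame_class, pitch=None):
    """Hedged sketch of the energy information parameter:
    - voiced/onset frames: maximum of the signal energy,
    - other frames: average energy per sample."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        if pitch:  # crude pitch-synchronous maximum over the last pitch cycle
            return float(np.max(x[-pitch:] ** 2))
        return float(np.max(x ** 2))
    return float(np.mean(x ** 2))

if __name__ == "__main__":
    frame = np.sin(2 * np.pi * 100 * np.arange(256) / 8000)
    print(energy_information(frame, "voiced", pitch=80))   # maximum-based
    print(energy_information(frame, "unvoiced"))           # mean-based
```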
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
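Analyst's note: claim 17 has the decoder scale the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, and then converge toward the energy signalled by the received energy information parameter by the end of that frame while limiting the increase. The sketch below uses a sample-wise linearly interpolated gain with a cap on the end gain; the interpolation shape and the cap value are assumptions of this illustration.

```python
import numpy as np

def smooth_recovery_scaling(synth, e_concealed_end, e_received, max_gain=2.0):
    """Hedged sketch: gain interpolated linearly across the first good frame,
    from a start gain matching the concealed-frame end energy to an end gain
    matching the received energy information, with the increase limited."""
    x = np.asarray(synth, dtype=float)
    eps = 1e-12
    e_start = np.mean(x[:16] ** 2) + eps             # energy at beginning of frame
    e_end = np.mean(x[-16:] ** 2) + eps              # energy toward end of frame
    g0 = np.sqrt(e_concealed_end / e_start)          # match last erased frame's energy
    g1 = min(np.sqrt(e_received / e_end), max_gain)  # converge, but limit the increase
    gain = np.linspace(g0, g1, len(x))               # sample-wise interpolation
    return x * gain

if __name__ == "__main__":
    frame = 0.5 * np.sin(2 * np.pi * 120 * np.arange(256) / 8000)
    out = smooth_recovery_scaling(frame, e_concealed_end=0.01, e_received=0.12)
    print(round(float(np.mean(out[:16] ** 2)), 4), round(float(np.mean(out[-16:] ** 2)), 4))
```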
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP0834863A2
CLAIM 2
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

EP0834863A2
CLAIM 12
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
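Analyst's note: claims 20 and 21 condition the adjustment on the LP filter gain of the first good frame being higher than that of the previous filter, and use the relation E_q = E_1 · (E_LP0 / E_LP1) as reconstructed above. A minimal sketch follows; measuring the LP filter "gain" as the energy of a truncated impulse response, and the truncation length, are assumptions consistent with the claim's definitions of E_LP0 and E_LP1.

```python
import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the truncated impulse response of the LP synthesis filter 1/A(z),
    with a = [1, a1, ..., ap] (direct-form denominator coefficients)."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjust_excitation_energy(e1, a_previous, a_first_good):
    """Hedged sketch of the claim 21 relation E_q = E_1 * (E_LP0 / E_LP1),
    applied only when the new LP filter has the higher gain (impulse-response
    energy used as the gain proxy in this sketch)."""
    e_lp0 = lp_impulse_response_energy(a_previous)    # filter before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # first good frame after it
    if e_lp1 <= e_lp0:          # gain did not increase: no adjustment in this sketch
        return e1
    return e1 * (e_lp0 / e_lp1)

if __name__ == "__main__":
    a_old = [1.0, -0.5]         # toy LP coefficients A(z) = 1 - 0.5 z^-1
    a_new = [1.0, -0.9]         # higher-gain filter
    print(adjust_excitation_energy(1.0, a_old, a_new))
```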
EP0834863A2
CLAIM 1
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

EP0834863A2
CLAIM 2
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

EP0834863A2
CLAIM 4
A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

EP0834863A2
CLAIM 12
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0834863A2
CLAIM 1
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

EP0834863A2
CLAIM 2
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

EP0834863A2
CLAIM 4
A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

EP0834863A2
CLAIM 5
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

EP0834863A2
CLAIM 12
A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5963896A

Filed: 1997-08-26     Issued: 1999-10-05

Speech coder including an excitation quantizer for retrieving positions of amplitude pulses using spectral parameters and different gains for groups of the pulses

(Original Assignee) NEC Corp     (Current Assignee) Rakuten Inc

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses, second pulses) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses, second pulses) of the low-pass filter each with a distance corresponding to an average pitch (average pitch) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
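Analyst's note: claim 1 constructs the periodic excitation of a lost onset artificially as a low-pass filtered train of pulses, centering the first impulse response of the low-pass filter on the quantized (transmitted) position of the first glottal pulse and placing the remaining responses one average pitch apart up to the end of the last affected sub-frame. The sketch below is a minimal illustration with a hypothetical short FIR low-pass impulse response; the filter itself and the frame length are assumptions.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_ir):
    """Hedged sketch: periodic part of the concealment excitation as a
    low-pass filtered pulse train anchored at the first glottal pulse."""
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:                       # up to end of last affected sub-frame
        start = pos - half                       # center the impulse response on pos
        for i, h in enumerate(lp_ir):
            idx = start + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += avg_pitch                         # next pulse one average pitch later
    return exc

if __name__ == "__main__":
    # Hypothetical 7-tap low-pass impulse response (Hann-shaped), for illustration only.
    lp_ir = np.hanning(7)
    lp_ir /= lp_ir.sum()
    exc = build_periodic_excitation(frame_len=256, first_pulse_pos=23, avg_pitch=80, lp_ir=lp_ir)
    print(np.flatnonzero(exc > exc.max() * 0.99))   # pulse centers at 23, 103, 183
```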
US5963896A
CLAIM 7
. A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US5963896A
CLAIM 21
. The speech coder according to claim 20 , wherein the feature quantity is an average pitch (average pitch) prediction gain .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
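Analyst's note: claim 4 classifies each successive frame as unvoiced, unvoiced transition, voiced transition, voiced, or onset before computing the energy information. The charted text does not spell out the decision logic, so the sketch below only indicates the shape of such a five-way classifier, using a normalized pitch correlation and a zero-crossing rate with purely illustrative thresholds; none of the features or thresholds are taken from the charted claims.

```python
import numpy as np

def classify_frame(frame, pitch, prev_class="unvoiced"):
    """Hedged five-way frame classifier sketch (features and thresholds illustrative only)."""
    x = np.asarray(frame, dtype=float)
    # Normalized correlation at the pitch lag (rough voicing measure).
    a, b = x[pitch:], x[:-pitch]
    corr = float(a @ b / (np.sqrt((a @ a) * (b @ b)) + 1e-12))
    # Zero-crossing rate (high for unvoiced / noise-like frames).
    zcr = float(np.mean(np.abs(np.diff(np.sign(x))) > 0))

    if corr > 0.8 and zcr < 0.2:
        return "onset" if prev_class in ("unvoiced", "unvoiced transition") else "voiced"
    if corr > 0.5:
        return "voiced transition"
    if prev_class in ("voiced", "onset", "voiced transition"):
        return "unvoiced transition"
    return "unvoiced"

if __name__ == "__main__":
    t = np.arange(256) / 8000
    voiced = np.sin(2 * np.pi * 100 * t)                   # 80-sample pitch at 8 kHz
    noise = np.random.default_rng(2).standard_normal(256)
    print(classify_frame(voiced, pitch=80, prev_class="unvoiced"))  # -> onset
    print(classify_frame(noise, pitch=80, prev_class="unvoiced"))   # -> unvoiced
```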
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5963896A
CLAIM 3
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US5963896A
CLAIM 20
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 , wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
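Analyst's note: for reference, the relation recited in method claim 9 (and in device claims 21 and 25), as reconstructed above from the garbled chart text, can be written as the following math block; the symbols follow the claim's own definitions.

```latex
% Energy adjustment of the LP-filter excitation in the first good frame
% (claims 9, 21 and 25, as reconstructed from the charted text):
\[
  E_q \;=\; E_1 \,\frac{E_{LP0}}{E_{LP1}}
\]
% E_1:   energy at the end of the current frame
% E_LP0: energy of the impulse response of the LP filter of the last
%        non-erased frame received before the erasure
% E_LP1: energy of the impulse response of the LP filter of the first
%        non-erased frame received after the erasure
```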
US5963896A
CLAIM 1
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

US5963896A
CLAIM 3
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

US5963896A
CLAIM 7
. A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

US5963896A
CLAIM 20
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
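A hedged Python sketch of one way the shape, sign and amplitude of a first glottal pulse could be encoded as recited above; the shape codebook, quantizer resolution and helper names are assumptions for illustration, not the patent's actual bit allocation.

```python
import numpy as np

# Hypothetical shape codebook of low-pass pulse prototypes (illustrative only).
SHAPE_CODEBOOK = np.array([
    [0.25, 0.5, 1.0, 0.5, 0.25],   # wide pulse
    [0.0,  0.5, 1.0, 0.5, 0.0],    # triangular pulse
    [0.0,  0.0, 1.0, 0.0, 0.0],    # single impulse
])

def encode_first_glottal_pulse(excitation, pulse_pos, amp_levels=16, amp_max=4.0):
    """Encode the first glottal pulse as (shape index, sign bit, amplitude index)."""
    excitation = np.asarray(excitation, dtype=float)
    width = SHAPE_CODEBOOK.shape[1]
    half = width // 2
    seg = excitation[max(0, pulse_pos - half): pulse_pos + half + 1]
    seg = np.pad(seg, (0, width - len(seg)))          # keep a fixed-width segment
    sign = 0 if excitation[pulse_pos] >= 0 else 1
    # shape: best-matching prototype by normalized correlation with the segment
    corr = SHAPE_CODEBOOK @ np.abs(seg)
    norm = np.linalg.norm(SHAPE_CODEBOOK, axis=1) * (np.linalg.norm(seg) + 1e-12)
    shape_idx = int(np.argmax(corr / norm))
    # amplitude: uniform scalar quantizer over [0, amp_max]
    amp = min(abs(excitation[pulse_pos]), amp_max)
    amp_idx = int(round(amp / amp_max * (amp_levels - 1)))
    return shape_idx, sign, amp_idx
```
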
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .
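A minimal sketch, assuming the "predetermined condition" is the pitch-prediction peak, of the pulse-position retrieval recited in US5963896A claim 8; the shift and search-width values are illustrative.

```python
import numpy as np

def search_pulse_position(pitch_pred, target, shift=2, search_width=8):
    """Anchor on the pitch-prediction peak, shift by a fixed number of samples,
    and search a small window for the best pulse position."""
    pitch_pred = np.asarray(pitch_pred, dtype=float)
    target = np.asarray(target, dtype=float)
    anchor = int(np.argmax(np.abs(pitch_pred)))       # sample meeting the condition
    start = max(0, anchor + shift - search_width // 2)
    stop = min(len(target), anchor + shift + search_width // 2 + 1)
    best = max(range(start, stop), key=lambda n: abs(target[n]))
    return best, (start, stop)
```
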

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
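A short Python sketch of determining and quantizing the first-glottal-pulse position as the maximum-amplitude sample within a pitch period, as recited in claim 11; the bit budget and uniform quantizer are assumptions.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, pos_bits=6):
    """Pick the maximum-amplitude sample of the LP residual within the first pitch
    period as the first glottal pulse and quantize its position uniformly."""
    T0 = max(1, int(round(pitch_period)))
    segment = np.asarray(residual[:T0], dtype=float)
    pos = int(np.argmax(np.abs(segment)))        # sample of maximum amplitude
    levels = 2 ** pos_bits
    q_index = pos * levels // T0                 # transmitted position index
    q_pos = q_index * T0 // levels               # position reconstructed at the decoder
    return pos, q_index, q_pos
```
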
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
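The energy-adjustment relation E_q = E_1 · (E_LP0 / E_LP1) can be exercised with a short sketch; the impulse-response length and the rescaling of the excitation to the target energy are illustrative choices, not mandated by the claim.

```python
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the impulse response of the all-pole LP synthesis filter 1/A(z),
    with a = [1, a1, ..., ap]."""
    h = [0.0] * n
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.sum(np.array(h) ** 2))

def adjust_excitation_energy(excitation, E1, a_last_good, a_first_good):
    """Rescale the first-good-frame excitation so that its energy equals
    E_q = E_1 * (E_LP0 / E_LP1), compensating the LP-filter gain difference."""
    E_LP0 = lp_impulse_response_energy(a_last_good)
    E_LP1 = lp_impulse_response_energy(a_first_good)
    E_q = E1 * E_LP0 / E_LP1
    excitation = np.asarray(excitation, dtype=float)
    E_cur = float(np.sum(excitation ** 2)) + 1e-12
    return excitation * np.sqrt(E_q / E_cur)
```
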
US5963896A
CLAIM 1
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

US5963896A
CLAIM 3
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

US5963896A
CLAIM 7
. A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US5963896A
CLAIM 20
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses, second pulses) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses, second pulses) of the low-pass filter each with a distance corresponding to an average pitch (average pitch) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
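A minimal sketch of the artificial periodic excitation of claim 13: a low-pass impulse response centered on the quantized first-glottal-pulse position and repeated every average pitch period up to the frame end. The FIR design is an assumption, since the claim only requires a low-pass filtered pulse train.

```python
import numpy as np

def low_pass_fir(num_taps=11, cutoff=0.25):
    """Windowed-sinc low-pass FIR (normalized cutoff in cycles/sample)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = np.sinc(2 * cutoff * n) * np.hamming(num_taps)
    return taps / np.sum(taps)

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, h=None):
    """Center the first low-pass impulse response on the quantized first-glottal-pulse
    position and repeat it every (rounded) average pitch period up to the frame end."""
    if h is None:
        h = low_pass_fir()
    exc = np.zeros(frame_len)
    half = len(h) // 2
    pos = int(first_pulse_pos)
    period = max(1, int(round(avg_pitch)))
    while pos < frame_len:
        for k, tap in enumerate(h):
            idx = pos - half + k                  # impulse response centered on 'pos'
            if 0 <= idx < frame_len:
                exc[idx] += tap
        pos += period                             # next pulse one average pitch later
    return exc
```
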
US5963896A
CLAIM 7
. A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US5963896A
CLAIM 21
. The speech coder according to claim 20 , wherein the feature quantity is an average pitch (average pitch) prediction gain .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
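A brief sketch of the energy information parameter of claim 16: maximum signal energy for voiced/onset frames, average energy per sample otherwise; the dB conversion is an illustrative choice.

```python
import numpy as np

VOICED_LIKE = {"voiced", "onset"}

def energy_information(frame, frame_class):
    """Maximum of the signal energy for voiced/onset frames, average energy per
    sample otherwise, expressed here in dB for illustration."""
    frame = np.asarray(frame, dtype=float)
    if frame_class in VOICED_LIKE:
        e = float(np.max(frame ** 2))
    else:
        e = float(np.mean(frame ** 2))
    return 10.0 * np.log10(e + 1e-12)
```
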
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
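A hedged sketch of the energy control in claim 17: the first good frame is scaled to match the concealed frame's ending energy at its start and converges toward the received energy information by its end, with the increase capped. The window length, gain cap and linear gain interpolation are assumptions.

```python
import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    """Match the start of the first good frame to the concealed frame's ending
    energy, converge to the received energy information by the frame end, and
    cap the gain to limit any energy increase."""
    synth = np.asarray(synth, dtype=float)
    win = max(1, min(len(synth) // 4, 64))
    e_start = float(np.mean(synth[:win] ** 2)) + 1e-12
    e_stop = float(np.mean(synth[-win:] ** 2)) + 1e-12
    g0 = min(max_gain, np.sqrt(e_end_concealed / e_start))   # match concealed end
    g1 = min(max_gain, np.sqrt(e_target / e_stop))           # converge to target
    gains = np.linspace(g0, g1, len(synth))                  # per-sample interpolation
    return synth * gains
```
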
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
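The two special cases of claim 19 reduce to a simple predicate; the class labels below mirror the claim's wording, while the boolean framing is illustrative.

```python
VOICED_LIKE_CLASSES = {"voiced transition", "voiced", "onset"}

def force_flat_gain(last_good_class, first_good_class,
                    last_good_is_comfort_noise, first_good_is_active):
    """Return True when the start-of-frame scaling gain should be forced equal to
    the end-of-frame gain: voiced-to-unvoiced transitions and comfort-noise-to-
    active-speech transitions."""
    voiced_to_unvoiced = (last_good_class in VOICED_LIKE_CLASSES
                          and first_good_class == "unvoiced")
    inactive_to_active = last_good_is_comfort_noise and first_good_is_active
    return voiced_to_unvoiced or inactive_to_active
```
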
US5963896A
CLAIM 3
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US5963896A
CLAIM 20
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5963896A
CLAIM 1
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

US5963896A
CLAIM 3
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

US5963896A
CLAIM 7
. A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

US5963896A
CLAIM 20
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (zero amplitude) , E_LP0 is an energy of an impulse response (impulse responses, second pulses) of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5963896A
CLAIM 1
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , and an excitation quantizer for retrieving positions of M non-zero amplitude (current frame) pulses which constitutes an excitation signal of the input speech signal with a different gain for each set of the pulses for each group of pulses less in number than M .

US5963896A
CLAIM 3
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal , and quantizing the spectral parameters thus obtained , an excitation quantizer for retrieving positions of M non-zero amplitude pulses which constitutes an excitation signal of the input speech signal with a different gain for each group of the pulses less in number than M , and a second excitation quantizer for retrieving the positions of a predetermined number (last non) of pulses by using the spectral parameters , the outputs of the first and second excitation quantizers being used to compute distortions of the speech so as to select the less distortion one of the first and second excitation quantizers .

US5963896A
CLAIM 7
. A speech coder comprising a spectral parameter computer for obtaining spectral parameters from an input speech signal and quantizing the spectral parameters thus obtained , an impulse response computer for computing impulse responses (impulse responses, impulse response) corresponding to the spectral parameters , a first correlation computer for computing correlations of the input signal and the impulse response , a second correlation computer for computing correlations among the impulse responses , a first pulse data computer for computing positions of first pulses from the outputs of the first and second correlation computers , a third correlation computer for correcting the output of the first correlation computer by using the output of the first pulse data computer , and a second pulse data computer for computing positions of second pulses (impulse responses, impulse response) from the outputs of the third and second correlation computers , the pulse data computation being made by executing the correlation correction and the pulse data computation iteratedly a predetermined number of times .

US5963896A
CLAIM 8
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , an adaptive codebook (sound signal, speech signal) means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and executing pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude pulses , obtaining a sample position corresponding to a pulse position meeting a predetermined condition with respect to the computed pitch prediction signal , setting a pulse position retrieval range on the basis of a position obtained by shifting the obtained sample position by a predetermined number of samples , retrieving a best position in the pulse position retrieval range thus set , and outputting data of the retrieved best position .

US5963896A
CLAIM 20
. A speech coder comprising a spectral parameter computer for obtaining a plurality of spectral parameters from an input speech signal and quantizing the obtained spectral parameters , a mode judging means for extracting a characteristic amount from the input speech signal , judging a plurality of modes from the extracted feature quantity , and outputting mode data , an adaptive codebook means for obtaining a delay corresponding to a pitch period from the input speech signal , computing a pitch prediction signal , and making pitch prediction , and an excitation quantizer for forming an excitation signal of the input speech signal with M non-zero amplitude signals , obtaining a sample position meeting a predetermined condition with respect to the pitch prediction signal when the mode data represents a (LP filter excitation signal) predetermined mode , setting a pulse position retrieval range on the basis of the obtained sample position , retrieving a best position in the pulse position retrieval range , and outputting data of the retrieved best position .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5956672A

Filed: 1997-08-15     Issued: 1999-09-21

Wide-band speech spectral quantizer

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Masahiro Serizawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (predetermined frequency) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5956672A
CLAIM 9
. A spectral quantizer for wide-band speech comprising : a frame circuit for cutting out frames with a predetermined window length from a speech signal ;
a band splitter for making predetermined frequency (placing remaining impulse responses) band splitting and computing each sub-band spectral coefficients ;
an analyzer for computing a spectral coefficient vector of each sub-band ;
an adder for obtaining a result of subtraction of each sub-band predicted spectral coefficient vector computed in the band splitter from the spectral coefficient vector ;
a quantizer for quantizing a result of subtraction for the full band , thus outputting a quantized prediction error vector ;
means for generating a full-band quantized vector by combining the quantized prediction error vectors of all the sub-bands ;
a synthesizer for outputting a full-band spectral coefficient vector by combining the spectral coefficient vectors of all the sub-bands received from the analyzer ;
an optimum prediction circuit for computing a full-band predicted spectral coefficient vector from the full-band quantized vector received from the quantizer and the full-band predicted spectral coefficient vector ;
and a band splitter for band splitting the full-band predicted spectral coefficient vector , and computing each sub-band predicted spectral coefficient vector .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis resulting concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .
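A minimal sketch, under assumed predictor and quantizer parameters, of the predictive split-band spectral quantization recited in US5956672A claim 1: per-band prediction residuals are quantized and recombined into a full-band quantized vector that drives the next frame's prediction.

```python
import numpy as np

class SplitBandPredictiveQuantizer:
    """Per-band prediction residuals are quantized, recombined into a full-band
    quantized vector, and that vector drives the prediction for the next frame."""
    def __init__(self, n_bands=2, pred_coef=0.5, step=0.05):
        self.n_bands = n_bands
        self.pred_coef = pred_coef      # illustrative first-order predictor
        self.step = step                # illustrative uniform quantizer step
        self.prev_quantized = None

    def quantize(self, spectral_vector):
        spectral_vector = np.asarray(spectral_vector, dtype=float)
        bands = np.array_split(spectral_vector, self.n_bands)
        if self.prev_quantized is None:
            predicted = [np.zeros_like(b) for b in bands]
        else:
            predicted = np.array_split(self.pred_coef * self.prev_quantized,
                                       self.n_bands)
        residuals = [b - p for b, p in zip(bands, predicted)]        # subtract prediction
        q_res = [np.round(r / self.step) * self.step for r in residuals]
        quantized = np.concatenate([p + q for p, q in zip(predicted, q_res)])
        self.prev_quantized = quantized                              # update predictor
        return quantized
```
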

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis resulting concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis resulting concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (predetermined frequency) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
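
A minimal Python sketch, under simplifying assumptions, of the artificial periodic excitation recited in claim 13 above: a windowed-sinc low-pass impulse response is centred on the quantized first glottal pulse position and repeated at the average pitch spacing up to the frame end. The filter design, tap count and cutoff are illustrative choices, not values from the patent.

import numpy as np

def lowpass_impulse_response(n_taps=25, cutoff=0.25):
    # Windowed-sinc low-pass filter; cutoff is a fraction of the Nyquist frequency.
    n = np.arange(n_taps) - (n_taps - 1) / 2.0
    h = np.sinc(2.0 * cutoff * n) * np.hamming(n_taps)
    return h / np.sum(h)

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, n_taps=25):
    h = lowpass_impulse_response(n_taps)
    half = (n_taps - 1) // 2
    buf = np.zeros(frame_len + n_taps)
    pos = int(first_pulse_pos)
    while pos < frame_len:                  # one impulse response per pitch period
        buf[pos:pos + n_taps] += h          # response centred on 'pos' after the shift below
        pos += int(round(avg_pitch))
    return buf[half:half + frame_len]       # align the response centres with the pulse positions
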
US5956672A
CLAIM 9
. A spectral quantizer for wide-band speech comprising : a frame circuit for cutting out frames with a predetermined window length from a speech signal ;
a band splitter for making predetermined frequency (placing remaining impulse responses) band splitting and computing each sub-band spectral coefficients ;
an analyzer for computing a spectral coefficient vector of each sub-band ;
an adder for obtaining a result of subtraction of each sub-band predicted spectral coefficient vector computed in the band splitter from the spectral coefficient vector ;
a quantizer for quantizing a result of subtraction for the full band , thus outputting a quantized prediction error vector ;
means for generating a full-band quantized vector by combining the quantized prediction error vectors of all the sub-bands ;
a synthesizer for outputting a full-band spectral coefficient vector by combining the spectral coefficient vectors of all the sub-bands received from the analyzer ;
an optimum prediction circuit for computing a full-band predicted spectral coefficient vector from the full-band quantized vector received from the quantizer and the full-band predicted spectral coefficient vector ;
and a band splitter for band splitting the full-band predicted spectral coefficient vector , and computing each sub-band predicted spectral coefficient vector .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
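
A short, hedged Python sketch of the energy information computation recited in claim 16 above; whether the maximum is taken pitch-synchronously is left as an assumption, and the function name is illustrative.

import numpy as np

def energy_information(frame, frame_class):
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("VOICED", "ONSET"):
        # relation to a maximum of the signal energy (a pitch-synchronous maximum
        # could be substituted here without changing the shape of the sketch)
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))       # average energy per sample for other classes
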
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis result concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis result concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis result concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5956672A
CLAIM 1
. A wide-band speech spectral quantizer comprising : a first means for splitting a frame speech signal (speech signal, decoder determines concealment) into a plurality of split signals ;
a second means for developing developed coefficients representing a frequency characteristic of each split signal ;
a third means for obtaining subtraction results by subtracting predicted coefficients from the developed coefficients ;
a fourth means for quantizing the subtraction results concerning the plurality of split signals and developing a quantization result of each split signal and a quantized synthesis result concerning the plurality of split signals ;
a fifth means for developing quantized coefficients concerning each split signal on the basis of the quantization result and the predicted coefficients ;
a sixth means for outputting the quantized coefficients ;
a seventh means for developing synthesized coefficients concerning the plurality of split signals by synthesizing the developed coefficients ;
an eighth means for developing predicted synthesis coefficients concerning the synthesized coefficients on the basis of the quantized synthesis result and the synthesized coefficients ;
and a ninth means for developing the predicted coefficients concerning each split signal on the basis of the predicted synthesis coefficients .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH1130997A

Filed: 1997-07-11     Issued: 1999-02-02

Speech encoding/decoding apparatus (音声符号化復号装置)

(Original Assignee) Nec Corp; 日本電気株式会社     

Toshiyuki Nomura, 俊之 野村
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (input speech signal) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH1130997A
CLAIM 1
. A speech encoding apparatus which , when hierarchically encoding a speech signal , creates N−1 signals obtained by changing the sampling frequency of an input speech signal , sequentially encodes the input speech signal (first impulse) and the signals with the changed sampling frequencies , starting from the signal with the lowest sampling frequency , and multiplexes together , over N layers , indices representing the linear prediction coefficients , pitch , multipulse signal and gain obtained by the encoding , wherein the encoding means of the n-th layer (n = 2 , … , N) includes at least an adaptive codebook search circuit which encodes a differential pitch relative to the pitch encoded and decoded up to the (n−1)-th layer and creates a corresponding adaptive code vector signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (input speech signal) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH1130997A
CLAIM 1
. A speech encoding apparatus which , when hierarchically encoding a speech signal , creates N−1 signals obtained by changing the sampling frequency of an input speech signal , sequentially encodes the input speech signal (first impulse) and the signals with the changed sampling frequencies , starting from the signal with the lowest sampling frequency , and multiplexes together , over N layers , indices representing the linear prediction coefficients , pitch , multipulse signal and gain obtained by the encoding , wherein the encoding means of the n-th layer (n = 2 , … , N) includes at least an adaptive codebook search circuit which encodes a differential pitch relative to the pitch encoded and decoded up to the (n−1)-th layer and creates a corresponding adaptive code vector signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5924062A

Filed: 1997-07-01     Issued: 1999-07-13

ACLEP codec with modified autocorrelation matrix storage and search

(Original Assignee) Nokia Mobile Phones Ltd     (Current Assignee) Qualcomm Inc

Tin Maung
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (impulse response vector, response signal) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response vector, response signal) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector (first impulse, impulse responses) obtained from a 40 sample sub-frame from a speech signal , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal (first impulse, impulse responses) obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .
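
For illustration of the track layout in US5924062A claims 2 and 18 (a 40-sample subframe split into five interleaved tracks of eight pulse positions, each spaced five positions apart), a small Python sketch; the helper names and the use of a full pre-computed correlation matrix are hypothetical conveniences, not the reference's storage scheme.

import numpy as np

def split_into_tracks(subframe, n_tracks=5):
    # Track t holds pulse positions t, t+5, t+10, ... within the 40-sample subframe.
    subframe = np.asarray(subframe)
    return [subframe[t::n_tracks] for t in range(n_tracks)]

def track_submatrix(corr, track_i, track_j, n_tracks=5):
    # 8x8 block of the 40x40 autocorrelation matrix restricted to two tracks.
    idx_i = np.arange(track_i, corr.shape[0], n_tracks)
    idx_j = np.arange(track_j, corr.shape[1], n_tracks)
    return corr[np.ix_(idx_i, idx_j)]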

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
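
A minimal Python sketch of the first-glottal-pulse determination in claim 3 above: take the sample of maximum amplitude within the pitch period and quantize its position uniformly. The bit allocation and the uniform step are assumptions made for illustration.

import numpy as np

def quantize_first_glottal_pulse(residual, pitch_period, n_bits=6):
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))         # sample of maximum amplitude in the pitch period
    step = max(1, pitch_period // (1 << n_bits))  # uniform step for an n_bits position index
    index = pos // step
    return index, index * step                    # index to transmit, reconstructed position
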
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (upper portion) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (first mapping) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
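
A hedged Python sketch of the energy control recited in claim 5 above: scale the start of the first good frame toward the energy at the end of the concealed frame, then interpolate toward the transmitted energy target while capping the gain increase. The quarter-frame measurement window, the linear interpolation, and the cap value are illustrative assumptions.

import numpy as np

def scale_first_good_frame(synth, e_concealed_end, e_target, max_gain=1.5):
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    eps = 1e-12
    e_begin = np.mean(synth[: max(1, n // 4)] ** 2) + eps
    e_frame = np.mean(synth ** 2) + eps
    g0 = min(np.sqrt(e_concealed_end / e_begin), max_gain)  # match the concealed frame's end energy
    g1 = min(np.sqrt(e_target / e_frame), max_gain)         # converge to the transmitted energy, capped
    return synth * np.linspace(g0, g1, n)                   # sample-by-sample interpolation of the gain
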
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (first mapping) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (first mapping) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (first mapping) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (first mapping) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (T rows) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
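
As a worked illustration of the claim 9 relation E_q = E_1 · E_LP0 / E_LP1, a small Python sketch that rescales the excitation accordingly; the square-root mapping from energy ratio to amplitude gain and the helper names are assumptions made for the example.

import numpy as np

def adjust_excitation_energy(excitation, e1, h_lp_last, h_lp_first):
    excitation = np.asarray(excitation, dtype=float)
    e_lp0 = float(np.sum(np.asarray(h_lp_last, dtype=float) ** 2))   # energy of the previous LP impulse response
    e_lp1 = float(np.sum(np.asarray(h_lp_first, dtype=float) ** 2))  # energy of the new LP impulse response
    e_q = e1 * e_lp0 / e_lp1                                         # target energy from the claim 9 relation
    e_cur = float(np.mean(excitation ** 2)) + 1e-12
    return excitation * np.sqrt(e_q / e_cur)                         # rescale the excitation to the target energy
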
US5924062A
CLAIM 1
. A memory connected to a correlator in an ACELP codec for storage of an N×N correlation matrix comprising a plurality of correlation coefficients calculated by the correlator , wherein the N×N correlation matrix is a Toeplitz-type matrix having symmetry along a main diagonal and wherein the N×N correlation matrix has an x-axis and a y-axis , the memory comprising : a plurality of tracks having a quantity T corresponding to an integral fraction of N , each track of the plurality of tracks defining a unique sub-set of N ;
a plurality of sub-matrices , each sub-matrix having N/T×N/T positions for receiving a subset of the plurality of correlation coefficients , each sub-matrix being defined by an autocorrelation of two tracks of the plurality of tracks , the two tracks comprising one of an autocorrelation of each track of the plurality of tracks to itself and an autocorrelation of each track of the plurality of tracks to at least a portion of the other tracks of the plurality of tracks ;
a plurality of mapping matrices , at least one mapping matrix containing the plurality of sub-matrices in an arrangement of T rows (E, E_q) and T columns ;
and a pointer for connecting one location selected from the T rows and T columns to the correlator whereby the sub-set of the plurality of correlation coefficients is stored in the sub-matrix corresponding to the one selected location .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (first mapping) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (T rows) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5924062A
CLAIM 1
. A memory connected to a correlator in an ACELP codec for storage of an N×N correlation matrix comprising a plurality of correlation coefficients calculated by the correlator , wherein the N×N correlation matrix is a Toeplitz-type matrix having symmetry along a main diagonal and wherein the N×N correlation matrix has an x-axis and a y-axis , the memory comprising : a plurality of tracks having a quantity T corresponding to an integral fraction of N , each track of the plurality of tracks defining a unique sub-set of N ;
a plurality of sub-matrices , each sub-matrix having N/T×N/T positions for receiving a subset of the plurality of correlation coefficients , each sub-matrix being defined by an autocorrelation of two tracks of the plurality of tracks , the two tracks comprising one of an autocorrelation of each track of the plurality of tracks to itself and an autocorrelation of each track of the plurality of tracks to at least a portion of the other tracks of the plurality of tracks ;
a plurality of mapping matrices , at least one mapping matrix containing the plurality of sub-matrices in an arrangement of T rows (E, E_q) and T columns ;
and a pointer for connecting one location selected from the T rows and T columns to the correlator whereby the sub-set of the plurality of correlation coefficients is stored in the sub-matrix corresponding to the one selected location .

US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (impulse response vector, response signal) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response vector, response signal) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector (first impulse, impulse responses) obtained from a 40 sample sub-frame from a speech signal , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal (first impulse, impulse responses) obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (upper portion) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (first mapping) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
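
The two-step energy control recited here (match the energy at the start of the first good frame to the end of the concealed signal, then converge toward the received energy value while capping any increase) can be pictured with the short sketch below; the linear gain interpolation, 32-sample measurement window and cap value are illustrative assumptions.

```python
import numpy as np

def scale_recovered_frame(synth, e_end_concealed, e_target, max_gain=1.2):
    """Scale the first good frame after an erasure: start from a gain matching
    the energy at the end of the last concealed frame, converge linearly to a
    gain derived from the received energy parameter, and cap any increase."""
    e_begin = np.mean(synth[:32] ** 2) + 1e-12                # energy at the frame beginning
    e_frame = np.mean(synth ** 2) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)    # match the concealed energy
    g1 = min(np.sqrt(e_target / e_frame), max_gain)           # converge to the received energy
    return synth * np.linspace(g0, g1, len(synth))            # sample-by-sample gain ramp

print(scale_recovered_frame(np.ones(160), e_end_concealed=0.25, e_target=1.0)[:3])
```
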
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .
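
Claim 18's track and sub-matrix bookkeeping can be made concrete with the indexing sketch below: five interleaved tracks of eight positions, fifteen distinct 8×8 blocks of the symmetric correlation matrix (one per unordered track pair), and a 5×5 mapping that points any row/column combination at the block holding the requested coefficient. The dense storage and the example impulse response are assumptions; the '062 patent's partially filled blocks and its second structure-information matrix are not reproduced.

```python
import numpy as np

N, T = 40, 5                                        # 40-sample subframe, five tracks
tracks = [list(range(t, N, T)) for t in range(T)]   # track t holds positions t, t+5, ..., t+35

def autocorr_matrix(h):
    """Toeplitz-type correlation matrix phi(i, j) of the weighted impulse response h."""
    phi = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            phi[i, j] = sum(h[n - i] * h[n - j] for n in range(max(i, j), N))
    return phi

rng = np.random.default_rng(1)
phi = autocorr_matrix(rng.standard_normal(N))

# Fifteen distinct 8x8 sub-matrices, one per unordered track pair (a <= b),
# exploiting the symmetry of phi along its main diagonal.
sub_matrices, sub_index = [], {}
for a in range(T):
    for b in range(a, T):
        sub_index[(a, b)] = len(sub_matrices)
        sub_matrices.append(phi[np.ix_(tracks[a], tracks[b])])

# 5x5 mapping matrix: (row, column) -> index of the sub-matrix storing that block.
mapping = [[sub_index[(min(r, c), max(r, c))] for c in range(T)] for r in range(T)]

def lookup(r, c, p, q):
    """Fetch phi(tracks[r][p], tracks[c][q]) through the mapping matrix; when
    r > c the stored block is the transpose of the requested block."""
    block = sub_matrices[mapping[r][c]]
    return block[p, q] if r <= c else block[q, p]

assert np.isclose(lookup(3, 1, 2, 7), phi[tracks[3][2], tracks[1][7]])
```
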

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (first mapping) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (first mapping) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
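
The transition conditions in this claim reduce to a simple predicate; the sketch below only restates them as code (the class labels and coding-mode flags are assumed names).

```python
def use_end_gain_at_frame_start(prev_class, curr_class, prev_is_cng, curr_is_active):
    """True when the scaling gain at the beginning of the first good frame is set
    equal to the gain used at its end: voiced-to-unvoiced transitions and
    comfort-noise-to-active-speech transitions."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = prev_class in voiced_like and curr_class == "unvoiced"
    cng_to_active = prev_is_cng and curr_is_active
    return voiced_to_unvoiced or cng_to_active

print(use_end_gain_at_frame_start("onset", "unvoiced", False, False))   # True
print(use_end_gain_at_frame_start("voiced", "voiced", True, True))      # True (CNG -> active)
```
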
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (first mapping) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (first mapping) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (T rows) = E_1 × E_LP0 (T rows) / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
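
Read on its face, the recited relation sets the excitation energy for the first good frame from the end-of-frame energy E_1 and the impulse-response energies of the old and new LP filters. A minimal sketch, assuming a conventional 1/A(z) synthesis filter, toy coefficients and an arbitrary truncation length:

```python
def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z),
    A(z) = 1 + a1*z^-1 + ... (sign convention and truncation length assumed)."""
    h = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        for k, a in enumerate(a_coeffs, start=1):
            if n - k >= 0:
                x -= a * h[n - k]
        h.append(x)
    return sum(v * v for v in h)

def excitation_energy(e1, e_lp0, e_lp1):
    """E_q = E_1 * (E_LP0 / E_LP1): target energy for the LP excitation of the
    first good frame when the new LP filter has a higher gain than the old one."""
    return e1 * e_lp0 / e_lp1

e_lp0 = lp_impulse_response_energy([-1.2, 0.5])   # LP filter of the last good frame (toy values)
e_lp1 = lp_impulse_response_energy([-0.9, 0.2])   # LP filter of the first good frame (toy values)
print(excitation_energy(e1=0.8, e_lp0=e_lp0, e_lp1=e_lp1))
```
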
US5924062A
CLAIM 1
. A memory connected to a correlator in an ACELP codec for storage of an N×N correlation matrix comprising a plurality of correlation coefficients calculated by the correlator , wherein the N×N correlation matrix is a Toeplitz-type matrix having symmetry along a main diagonal and wherein the N×N correlation matrix has an x-axis and a y-axis , the memory comprising : a plurality of tracks having a quantity T corresponding to an integral fraction of N , each track of the plurality of tracks defining a unique sub-set of N ;
a plurality of sub-matrices , each sub-matrix having N/T×N/T positions for receiving a subset of the plurality of correlation coefficients , each sub-matrix being defined by an autocorrelation of two tracks of the plurality of tracks , the two tracks comprising one of an autocorrelation of each track of the plurality of tracks to itself and an autocorrelation of each track of the plurality of tracks to at least a portion of the other tracks of the plurality of tracks ;
a plurality of mapping matrices , at least one mapping matrix containing the plurality of sub-matrices in an arrangement of T rows (E, E q) and T columns ;
and a pointer for connecting one location selected from the T rows and T columns to the correlator whereby the sub-set of the plurality of correlation coefficients is stored in the sub-matrix corresponding to the one selected location .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (upper portion) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5924062A
CLAIM 2
. The memory of claim 1 , wherein the N×N correlation matrix is a 40×40 matrix computed by autocorrelation of a 40 sample weighted impulse response vector obtained from a 40 sample sub-frame from a speech signal (speech signal, decoder determines concealment) , the 40 sample weighted impulse vector having a sign vector incorporated .

US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (upper portion) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (CELP codec) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (first mapping) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (T rows) = E_1 × E_LP0 (T rows) / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5924062A
CLAIM 1
. A memory connected to a correlator in an ACELP codec (frame concealment) for storage of an N×N correlation matrix comprising a plurality of correlation coefficients calculated by the correlator , wherein the N×N correlation matrix is a Toeplitz-type matrix having symmetry along a main diagonal and wherein the N×N correlation matrix has an x-axis and a y-axis , the memory comprising : a plurality of tracks having a quantity T corresponding to an integral fraction of N , each track of the plurality of tracks defining a unique sub-set of N ;
a plurality of sub-matrices , each sub-matrix having N/T×N/T positions for receiving a subset of the plurality of correlation coefficients , each sub-matrix being defined by an autocorrelation of two tracks of the plurality of tracks , the two tracks comprising one of an autocorrelation of each track of the plurality of tracks to itself and an autocorrelation of each track of the plurality of tracks to at least a portion of the other tracks of the plurality of tracks ;
a plurality of mapping matrices , at least one mapping matrix containing the plurality of sub-matrices in an arrangement of T rows (E, E q) and T columns ;
and a pointer for connecting one location selected from the T rows and T columns to the correlator whereby the sub-set of the plurality of correlation coefficients is stored in the sub-matrix corresponding to the one selected location .

US5924062A
CLAIM 14
. The memory of claim 13 , wherein the at least one mapping function provides means for selecting one of an upper portion (signal classification parameter, signal energy) and a lower portion of a sub-matrix for storage of the correlation coefficients in the digital signal processor memory .

US5924062A
CLAIM 18
. A method performed in a digital signal processor having a memory and correlator , the method for storing and searching an autocorrelation matrix in an EFR-ACELP codec implemented in the digital signal processor , the correlator for computing a plurality of correlation coefficients for generating the autocorrelation matrix from a 40 sample weighted impulse response signal obtained from a 40 sample subframe , the method comprising : dividing the 40 sample subframe into five tracks , each track comprising a set of eight pulse positions spaced five pulse positions apart from a preceding pulse position , each track having a unique set of eight pulse positions ;
defining a set of fifteen sub-matrices based on an autocorrelation of each track of the five tracks to itself and on an autocorrelation of each track to at least a portion of the other tracks , each sub-matrix being an 8×8 matrix ;
defining a first mapping (first non) matrix having five columns and five rows , each column comprising five at least partially filled sub-matrices of the set of fifteen sub-matrices ;
defining a second mapping matrix containing structure information for correlating with the first mapping matrix for determining a configuration of the at least partially filled sub-matrices ;
and addressing a location corresponding to a column and row combination , each location corresponding to one of the at least partially filled sub-matrices , for connecting the correlator to a position within each at least partial sub-matrix .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6073092A

Filed: 1997-06-26     Issued: 2000-06-06

Method for speech coding based on a code excited linear prediction (CELP) model

(Original Assignee) Telogy Networks Inc     (Current Assignee) Google Technology Holdings LLC

Soon Y. Kwon
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (high p, pitch p) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
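
For reference, the artificial periodic part described here (a low-pass-filtered pulse train anchored at the quantized first-glottal-pulse position and repeated at the average pitch spacing) can be sketched as follows; the filter impulse response, frame length and centring convention are illustrative assumptions.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, pitch, lp_ir):
    """Artificial periodic excitation for a lost onset frame: a train of
    low-pass-filter impulse responses, the first one centred on the quantized
    position of the first glottal pulse, the following ones spaced by the
    average pitch value."""
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start = pos - half
        lo, hi = max(start, 0), min(start + len(lp_ir), frame_len)
        exc[lo:hi] += lp_ir[lo - start:hi - start]   # place one low-pass impulse response
        pos += pitch                                  # next pulse one (average) pitch period later
    return exc

lp_ir = np.array([0.05, 0.25, 0.4, 0.25, 0.05])       # toy symmetric low-pass impulse response
print(build_periodic_excitation(256, first_pulse_pos=17, pitch=60, lp_ir=lp_ir)[12:24])
```
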
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .
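
The excitation/synthesis structure recited in this claim can be pictured with the toy sketch below: two fixed codevectors scaled by their gains form the mixed excitation, the adaptive-codebook contribution scaled by the pitch gain is added, and the sum drives an all-pole LP synthesis filter. The codevectors, gains and filter coefficients are placeholders, not the '092 patent's codebook design.

```python
import numpy as np

def lp_synthesis(excitation, a):
    """All-pole LP synthesis 1/A(z), A(z) = 1 + a1*z^-1 + ... (assumed sign convention)."""
    out = np.zeros_like(excitation, dtype=float)
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc -= a[k - 1] * out[n - k]
        out[n] = acc
    return out

def celp_subframe(c_base, g_base, c_impl, g_impl, adaptive, g_pitch, a):
    """Mixed excitation from two fixed codevectors plus the adaptive-codebook
    contribution scaled by the pitch gain, fed to the LP synthesis filter."""
    mixed = g_base * c_base + g_impl * c_impl     # mixed excitation function
    excitation = mixed + g_pitch * adaptive       # add the pitch (adaptive codebook) part
    return lp_synthesis(excitation, a)

rng = np.random.default_rng(2)
L = 40
synth = celp_subframe(rng.standard_normal(L), 0.8,   # baseline codevector and gain
                      rng.standard_normal(L), 0.3,   # implied codevector and gain
                      rng.standard_normal(L), 0.9,   # adaptive codevector and pitch gain
                      a=np.array([-1.2, 0.5]))       # toy LPC coefficients
print(synth[:5])
```
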

US6073092A
CLAIM 7
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the encoder means further comprises : (a) high pass (average pitch, E q) filtering the speech ;
(b) dividing the speech into frames of speech ;
(c) providing autocorrelation calculation of the frames of speech ;
(d) generating prediction coefficients from the speech samples using linear prediction coding analysis ;
(e) bandwidth expanding the prediction coefficients ;
(f) transforming the bandwidth expanded prediction coefficients into line spectrum pair frequencies ;
(g) transforming the line spectrum pair frequencies into line spectrum pair residual vectors ;
(h) split vector quantizing the line spectrum pair residual vectors ;
(i) decoding the line spectrum pair frequencies ;
(j) interpolating the line spectrum pair frequencies ;
(k) converting the line spectrum pair frequencies to linear coding prediction coefficients ;
(l) extracting pitch filter parameters from the frames of speech ;
(m) encoding the pitch filter parameters ;
and (n) extracting mixed excitation function parameters from the baseline codebook and the implied codebook .

US6073092A
CLAIM 10
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 7 wherein extracting pitch filter parameters from the frames of speech further comprises : (a) providing a zero input response ;
(b) providing a perceptual weighting filter ;
(c) subtracting the zero input response from the speech to form an input to the perceptual weighting filter ;
(d) providing a target signal , which further comprises the output from the perceptual weighting filter ;
(e) providing a weighted LPC filter ;
(f) adjusting the adaptive codevector by the adaptive gain to form an input to the weighted LPC filter ;
(g) determining the difference between the output from the weighted LPC filter and the target signal ;
(h) finding the mean squared error for all possible combinations of adaptive codevector and adaptive gain ;
and (i) selecting the adaptive codevector and adaptive gain that correlate to the minimum mean (average pitch value) squared error as the pitch filter parameters .
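
Steps (a) through (i) amount to a closed-loop (analysis-by-synthesis) pitch search. A compressed sketch follows, under an assumed lag grid, gain grid and weighting-filter impulse response; the '092 patent's actual search ranges and quantizers are not reproduced.

```python
import numpy as np

def closed_loop_pitch_search(target, past_exc, w_ir, lags, gains):
    """Select the adaptive codevector (pitch lag) and adaptive gain whose
    weighted-filtered contribution has minimum mean squared error vs. the target."""
    L = len(target)
    best = (None, None, np.inf)
    for lag in lags:
        start = len(past_exc) - lag
        seg = past_exc[start:start + L]
        code = np.resize(seg, L)                 # repeat short segments for lags < L (simplified)
        y = np.convolve(code, w_ir)[:L]          # codevector passed through the weighted filter
        for g in gains:
            err = np.mean((target - g * y) ** 2)
            if err < best[2]:
                best = (lag, g, err)
    return best                                   # (pitch lag, adaptive gain, mse)

rng = np.random.default_rng(3)
history = rng.standard_normal(200)               # past excitation (adaptive codebook memory)
target = rng.standard_normal(40)                 # perceptually weighted target signal
w_ir = np.array([1.0, -0.6, 0.2])                # assumed weighted LPC filter impulse response
print(closed_loop_pitch_search(target, history, w_ir,
                               lags=range(20, 144), gains=np.linspace(0.0, 1.2, 13)))
```
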

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post (average pitch, E q) filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .
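
As context for the post-filter chain listed in (a) through (f), below is a simplified sketch in the recited order; the weighting factors, the crude tilt-compensation and highpass stages, and the gain-matching rule are illustrative assumptions rather than the '092 patent's coefficients.

```python
import numpy as np

def pole_filter(x, a):
    """All-pole filter 1/A(z), A(z) = 1 + a1*z^-1 + ... (assumed sign convention)."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] - sum(a[k] * y[n - k - 1] for k in range(len(a)) if n > k)
    return y

def postfilter(speech, a, pitch, gamma_n=0.55, gamma_d=0.7, g_pitch=0.25, mu=0.3):
    """Simplified post-filter chain in the recited order: zero (inverse) filter,
    pitch post filter, all-pole filter, tilt compensation, gain control, highpass."""
    a = np.asarray(a, dtype=float)
    k = np.arange(1, len(a) + 1)
    num = np.concatenate(([1.0], a * gamma_n ** k))          # A(z / gamma_n), the zero filter
    den = a * gamma_d ** k                                    # A(z / gamma_d), the all-pole part
    res = np.convolve(speech, num)[:len(speech)]              # (a) inverse filtering -> residual
    res[pitch:] += g_pitch * res[:-pitch]                     # (b) pitch post filter
    out = pole_filter(res, den)                               # (c) all-pole filter
    out = np.convolve(out, [1.0, -mu])[:len(out)]             # (d) crude tilt compensation
    out *= np.sqrt((np.sum(speech ** 2) + 1e-12) /
                   (np.sum(out ** 2) + 1e-12))                # (e) gain control to match input energy
    return np.convolve(out, [0.5, -0.5])[:len(out)]           # (f) crude first-order highpass

rng = np.random.default_rng(4)
print(postfilter(rng.standard_normal(160), a=[-1.2, 0.5], pitch=40)[:4])
```
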

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (random code) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebook (average energy) s , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US6073092A
CLAIM 2
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 wherein the two fixed codebooks further comprise : (a) selecting the first of the combination of two codevectors from a pulse codebook with a plurality of pulse codevectors ;
and (b) selecting the second of the combination of two codevectors from a random code (signal energy) book with a plurality of random codevectors .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (filter output) and the first non erased frame received after frame erasure is encoded as active speech .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US6073092A
CLAIM 12
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the decoder means further comprises : (a) generating the mixed excitation function from the baseline codebook and the implied codebook using the selected baseline codevector and implied codevector ;
(b) generating an input to a linear predictive coding synthesis filter from the mixed excitation function and the adaptive codebook using the selected adaptive codevector ;
(c) calculating an implied codevector from the output of the linear predictive coding synthesis filter ;
(d) providing feedback of the calculated pitch filter output (comfort noise) to the adaptive codebook ;
(e) post filtering the output from the linear predictive coding synthesis filter ;
and (f) producing a perceptually weighted speech from the post filtered output .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (encoded speech signal, inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering (decoder determines concealment, LP filter, LP filter excitation signal) the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .
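The post-filter chain of US6073092A claim 14 can be pictured with the simplified sketch below; the gamma and tilt constants are assumptions, and the pitch post filter and final high-pass stage of the reference are omitted to keep the example short.

```python
# Simplified short-term postfilter sketch (assumed constants; pitch postfilter and
# high-pass stage of the reference are omitted here).
import numpy as np

def fir(b, x):
    """y[n] = sum_k b[k] * x[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k in range(len(b)):
            if n - k >= 0:
                y[n] += b[k] * x[n - k]
    return y

def allpole(a, x):
    """All-pole filtering with a = [1, a1, ..., ap]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def postfilter(speech, a, gamma_n=0.55, gamma_d=0.7, mu=0.3):
    a = np.asarray(a, dtype=float)
    an = a * gamma_n ** np.arange(len(a))       # zero filter A(z/gamma_n)
    ad = a * gamma_d ** np.arange(len(a))       # all-pole filter 1/A(z/gamma_d)
    residual = fir(an, speech)                  # (a) inverse filtering with the zero filter
    shaped = allpole(ad, residual)              # (c) all-pole shaping
    tilted = fir(np.array([1.0, -mu]), shaped)  # (d) first-order tilt compensation
    gain = np.sqrt(np.sum(speech ** 2) / max(np.sum(tilted ** 2), 1e-12))
    return gain * tilted                        # (e) gain control to match the input energy
```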

US6073092A
CLAIM 15
. A method of encoding a speech signal comprising : adjusting a baseline codevector by a baseline gain and adjusting an implied codevector by an implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and producing an encoded speech signal (decoder determines concealment, LP filter, LP filter excitation signal) based on an output of the predictive coding synthesis filter .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (encoded speech signal, inverse filtering) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q (high p, pitch p) = E 1 · ( E LP0 / E LP1 ) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
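Read literally, the claim 9 relation scales the end-of-frame energy E 1 by the ratio of the two LP impulse-response energies. The snippet below is a direct numerical transcription of that relation, with our own function names (not the patent's).

```python
# Numerical reading of E_q = E_1 * (E_LP0 / E_LP1) from claim 9 (illustrative only).
import numpy as np

def impulse_response_energy(a, length=64):
    """Energy of the impulse response of 1/A(z), a = [1, a1, ..., ap]."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def target_excitation_energy(e1, a_last_good, a_first_good):
    """E_q, with E_LP0 from the last good frame before erasure and E_LP1 from the first good frame after it."""
    e_lp0 = impulse_response_energy(a_last_good)
    e_lp1 = impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1
```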
US6073092A
CLAIM 7
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the encoder means further comprises : (a) high pass (average pitch, E q) filtering the speech ;
(b) dividing the speech into frames of speech ;
(c) providing autocorrelation calculation of the frames of speech ;
(d) generating prediction coefficients from the speech samples using linear prediction coding analysis ;
(e) bandwidth expanding the prediction coefficients ;
(f) transforming the bandwidth expanded prediction coefficients into line spectrum pair frequencies ;
(g) transforming the line spectrum pair frequencies into line spectrum pair residual vectors ;
(h) split vector quantizing the line spectrum pair residual vectors ;
(i) decoding the line spectrum pair frequencies ;
(j) interpolating the line spectrum pair frequencies ;
(k) converting the line spectrum pair frequencies to linear coding prediction coefficients ;
(l) extracting pitch filter parameters from the frames of speech ;
(m) encoding the pitch filter parameters ;
and (n) extracting mixed excitation function parameters from the baseline codebook and the implied codebook .
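Steps (a) through (e) of US6073092A claim 7 are a conventional LP analysis front end. The sketch below (window choice, white-noise correction and expansion factor are our assumptions) shows autocorrelation, the Levinson-Durbin recursion and bandwidth expansion; the LSP conversion and quantization steps of the claim are not reproduced.

```python
# LP analysis front-end sketch: autocorrelation, Levinson-Durbin, bandwidth expansion
# (constants are assumptions; LSP conversion and quantization omitted).
import numpy as np

def autocorrelation(frame, order):
    return np.array([float(np.dot(frame[: len(frame) - k], frame[k:])) for k in range(order + 1)])

def levinson_durbin(r):
    """Return A(z) = [1, a1, ..., ap] solving the autocorrelation normal equations."""
    order = len(r) - 1
    a, err = [1.0], max(r[0], 1e-12)
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        a = [a[j] + k * a[i - j] if 1 <= j < i else a[j] for j in range(i)] + [k]
        err *= (1.0 - k * k)
    return np.array(a), err

def bandwidth_expand(a, gamma=0.994):
    """Scale coefficient k by gamma**k to widen formant bandwidths (step (e))."""
    return a * gamma ** np.arange(len(a))

def lp_analysis(frame, order=10):
    windowed = frame * np.hamming(len(frame))
    r = autocorrelation(windowed, order)
    r[0] *= 1.0001                              # small white-noise correction for numerical safety
    a, _ = levinson_durbin(r)
    return bandwidth_expand(a)
```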

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering (decoder determines concealment, LP filter, LP filter excitation signal) the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post (average pitch, E q) filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .

US6073092A
CLAIM 15
. A method of encoding a speech signal comprising : adjusting a baseline codevector by a baseline gain and adjusting an implied codevector by an implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and producing an encoded speech signal (decoder determines concealment, LP filter, LP filter excitation signal) based on an output of the predictive coding synthesis filter .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
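One plausible, hypothetical reading of the claim 10 phase-information step is sketched below: the first glottal pulse in a pitch cycle of the LP residual is located, and its position, sign, amplitude and a shape index drawn from a small codebook are the quantities handed to the channel. The codebook and the neighbourhood size are assumptions made for the example.

```python
# Hypothetical encoding of the first glottal pulse (position, sign, amplitude, shape index).
import numpy as np

def encode_first_glottal_pulse(residual, pitch_period, shape_codebook):
    segment = residual[:pitch_period]                      # search within the first pitch cycle
    position = int(np.argmax(np.abs(segment)))             # sample of maximum amplitude
    amplitude = float(abs(segment[position]))
    sign = 1 if segment[position] >= 0 else -1
    lo, hi = max(0, position - 2), min(len(residual), position + 3)
    local = residual[lo:hi] / (amplitude if amplitude > 0 else 1.0)
    # nearest codebook shape in the squared-error sense (shapes are truncated/padded to fit)
    shape_index = int(np.argmin([float(np.sum((local - np.resize(c, len(local))) ** 2))
                                 for c in shape_codebook]))
    return {"position": position, "sign": sign, "amplitude": amplitude, "shape_index": shape_index}
```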
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
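Claim 11 narrows the phase parameter to the position of the maximum-amplitude sample within a pitch period, which is then quantized. A minimal uniform quantizer, with an assumed bit budget, is sketched below.

```python
# Illustrative uniform quantizer for the glottal-pulse position within a pitch period
# (the bit budget and step-size rule are assumptions for this sketch).
import numpy as np

def quantize_pulse_position(residual, pitch_period, bits=6):
    segment = np.abs(residual[:pitch_period])
    position = int(np.argmax(segment))        # sample of maximum amplitude = first glottal pulse
    step = max(1, int(np.ceil(pitch_period / (2 ** bits))))
    index = position // step                  # index actually transmitted
    reconstructed = index * step              # decoder-side estimate of the position
    return index, reconstructed
```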
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (encoded speech signal, inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (high p, pitch p) = E 1 · ( E LP0 / E LP1 ) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US6073092A
CLAIM 7
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the encoder means further comprises : (a) high pass (average pitch, E q) filtering the speech ;
(b) dividing the speech into frames of speech ;
(c) providing autocorrelation calculation of the frames of speech ;
(d) generating prediction coefficients from the speech samples using linear prediction coding analysis ;
(e) bandwidth expanding the prediction coefficients ;
(f) transforming the bandwidth expanded prediction coefficients into line spectrum pair frequencies ;
(g) transforming the line spectrum pair frequencies into line spectrum pair residual vectors ;
(h) split vector quantizing the line spectrum pair residual vectors ;
(i) decoding the line spectrum pair frequencies ;
(j) interpolating the line spectrum pair frequencies ;
(k) converting the line spectrum pair frequencies to linear coding prediction coefficients ;
(l) extracting pitch filter parameters from the frames of speech ;
(m) encoding the pitch filter parameters ;
and (n) extracting mixed excitation function parameters from the baseline codebook and the implied codebook .

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering (decoder determines concealment, LP filter, LP filter excitation signal) the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post (average pitch, E q) filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .

US6073092A
CLAIM 15
. A method of encoding a speech signal comprising : adjusting a baseline codevector by a baseline gain and adjusting an implied codevector by an implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and producing an encoded speech signal (decoder determines concealment, LP filter, LP filter excitation signal) based on an output of the predictive coding synthesis filter .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (high p, pitch p) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
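The decoder-side construction recited in claim 13 amounts to summing shifted copies of a low-pass impulse response, the first copy centred on the transmitted glottal-pulse position and the following copies spaced by the average pitch. A minimal sketch, with an assumed windowed-sinc low-pass prototype, follows.

```python
# Sketch of the artificial periodic excitation of claim 13 (filter length and cutoff are assumptions).
import numpy as np

def lowpass_impulse_response(length=17, cutoff=0.25):
    """Windowed-sinc low-pass prototype (normalized cutoff in cycles/sample)."""
    n = np.arange(length) - (length - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * n)
    return h * np.hamming(length)

def build_periodic_excitation(frame_length, first_pulse_pos, avg_pitch):
    h = lowpass_impulse_response()
    half = len(h) // 2
    excitation = np.zeros(frame_length)
    pos = first_pulse_pos
    while pos < frame_length:                     # place pulses up to the end of the frame
        lo, hi = max(0, pos - half), min(frame_length, pos + half + 1)
        excitation[lo:hi] += h[lo - (pos - half): hi - (pos - half)]
        pos += int(round(avg_pitch))              # next pulse one average pitch period later
    return excitation
```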
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US6073092A
CLAIM 7
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the encoder means further comprises : (a) high pass (average pitch, E q) filtering the speech ;
(b) dividing the speech into frames of speech ;
(c) providing autocorrelation calculation of the frames of speech ;
(d) generating prediction coefficients from the speech samples using linear prediction coding analysis ;
(e) bandwidth expanding the prediction coefficients ;
(f) transforming the bandwidth expanded prediction coefficients into line spectrum pair frequencies ;
(g) transforming the line spectrum pair frequencies into line spectrum pair residual vectors ;
(h) split vector quantizing the line spectrum pair residual vectors ;
(i) decoding the line spectrum pair frequencies ;
(j) interpolating the line spectrum pair frequencies ;
(k) converting the line spectrum pair frequencies to linear coding prediction coefficients ;
(l) extracting pitch filter parameters from the frames of speech ;
(m) encoding the pitch filter parameters ;
and (n) extracting mixed excitation function parameters from the baseline codebook and the implied codebook .

US6073092A
CLAIM 10
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 7 wherein extracting pitch filter parameters from the frames of speech further comprises : (a) providing a zero input response ;
(b) providing a perceptual weighting filter ;
(c) subtracting the zero input response from the speech to form an input to the perceptual weighting filter ;
(d) providing a target signal , which further comprises the output from the perceptual weighting filter ;
(e) providing a weighted LPC filter ;
(f) adjusting the adaptive codevector by the adaptive gain to form an input to the weighted LPC filter ;
(g) determining the difference between the output from the weighted LPC filter and the target signal ;
(h) finding the mean squared error for all possible combinations of adaptive codevector and adaptive gain ;
and (i) selecting the adaptive codevector and adaptive gain that correlate to the minimum mean squared error (average pitch value) as the pitch filter parameters .
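The adaptive-codebook search of US6073092A claim 10 can be summarized as a lag/gain search minimizing the mean squared error against a target. The sketch below simplifies the claimed exhaustive gain search by using the closed-form MSE-optimal gain per lag, omits the weighting filter, and assumes the past-excitation buffer holds at least max_lag samples.

```python
# Simplified adaptive-codebook (pitch) search by MSE minimization (illustrative only).
import numpy as np

def adaptive_codebook_search(target, past_excitation, min_lag=20, max_lag=143):
    """Return the (lag, gain) pair whose scaled adaptive codevector is closest to the target."""
    n = len(target)
    best_lag, best_gain, best_err = min_lag, 0.0, np.inf
    for lag in range(min_lag, max_lag + 1):
        ext = list(past_excitation)
        start = len(past_excitation) - lag
        candidate = np.empty(n)
        for i in range(n):
            candidate[i] = ext[start + i]
            ext.append(candidate[i])          # periodic extension when lag < subframe length
        denom = float(np.dot(candidate, candidate))
        if denom == 0.0:
            continue
        gain = float(np.dot(target, candidate)) / denom   # closed-form MSE-optimal gain
        err = float(np.sum((target - gain * candidate) ** 2))
        if err < best_err:
            best_lag, best_gain, best_err = lag, gain, err
    return best_lag, best_gain, best_err
```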

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post (average pitch, E q) filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (random code) for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
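The energy information parameter of claim 16 switches between two statistics depending on the frame class. A direct, illustrative transcription is shown below; the dB conversion is our assumption.

```python
# Energy information parameter per claim 16: maximum energy for voiced/onset frames,
# average energy per sample otherwise (dB conversion is an assumption of this sketch).
import numpy as np

def energy_information(frame, frame_class):
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        value = float(np.max(frame ** 2))      # maximum of the signal energy
    else:
        value = float(np.mean(frame ** 2))     # average energy per sample
    return 10.0 * np.log10(max(value, 1e-12))  # dB value handed to the quantizer
```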
US6073092A
CLAIM 1
. A method for speech coding based on a code excited linear prediction (CELP) model comprising : (a) dividing speech at a sending station into discrete speech samples ;
(b) digitizing the discrete speech samples ;
(c) forming a mixed excitation function by selecting a combination of two codevectors from two fixed codebooks (average energy) , each having a plurality of codevectors , and selecting a combination of two codebook gain vectors from a plurality of codebook gain vectors ;
(d) selecting an adaptive codevector from an adaptive codebook (sound signal, speech signal) , and selecting a pitch gain in combination with the mixed excitation function to represent the digitized speech ;
(e) encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and the pitch gain as a digital data stream ;
(f) sending the digital data stream from the sending station to a receiving station using transmission means ;
(g) decoding the digital data stream at the receiving station to reproduce the selected codevector , the two codebook gain vectors , the adaptive codevector , the pitch gain , and LPC filter parameters ;
(h) reproducing a digitized speech sample at the receiving station using the selected codevector , the two codebook gain vectors , adaptive codevector , the pitch gain , and the LPC filter parameters ;
(i) converting the digitized speech sample at the receiving station into an analog speech sample ;
and (j) combining a series of analog speech samples to reproduce the coded speech ;
and wherein encoding one of the two selected codevectors , both of the selected codebook gain vectors , the adaptive codevector and pitch gain as a digital data stream further comprises : adjusting the baseline codevector by the baseline gain and adjusting the implied codevector by the implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and subtracting the output from the linear predictive coding synthesis filter from the speech to form an input to a weighting filter .

US6073092A
CLAIM 2
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 wherein the two fixed codebooks further comprise : (a) selecting the first of the combination of two codevectors from a pulse codebook with a plurality of pulse codevectors ;
and (b) selecting the second of the combination of two codevectors from a random codebook (signal energy) with a plurality of random codevectors .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
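Claim 17's recovery behaviour can be pictured as a per-sample gain ramp: the scaling starts at a value that matches the energy at the end of the concealed segment and converges toward the gain implied by the received energy information, with any increase capped. The sketch below is a hypothetical realization of that idea, not the patented algorithm.

```python
# Hypothetical gain ramp for the claim 17 recovery behaviour (assumed definitions throughout).
import numpy as np

def begin_end_gains(synth, e_end_concealed, e_target):
    """Gains that match the concealed-segment energy at the start of the first good frame
    and the transmitted energy information at its end."""
    e_actual = float(np.mean(np.asarray(synth, dtype=float) ** 2)) + 1e-12
    return np.sqrt(e_end_concealed / e_actual), np.sqrt(e_target / e_actual)

def scale_recovered_frame(synth, g_begin, g_end, max_gain=2.0):
    """Interpolate the scaling gain across the frame while capping any energy increase."""
    g0, g1 = min(g_begin, max_gain), min(g_end, max_gain)
    gains = np.linspace(g0, g1, num=len(synth))
    return np.asarray(synth, dtype=float) * gains
```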
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (filter output) and the first non erased frame received after frame erasure is encoded as active speech .
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US6073092A
CLAIM 12
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the decoder means further comprises : (a) generating the mixed excitation function from the baseline codebook and the implied codebook using the selected baseline codevector and implied codevector ;
(b) generating an input to a linear predictive coding synthesis filter from the mixed excitation function and the adaptive codebook using the selected adaptive codevector ;
(c) calculating an implied codevector from the output of the linear predictive coding synthesis filter ;
(d) providing feedback of the calculated pitch filter output (comfort noise) to the adaptive codebook ;
(e) post filtering the output from the linear predictive coding synthesis filter ;
and (f) producing a perceptually weighted speech from the post filtered output .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (encoded speech signal, inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering (decoder determines concealment, LP filter, LP filter excitation signal) the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .

US6073092A
CLAIM 15
. A method of encoding a speech signal comprising : adjusting a baseline codevector by a baseline gain and adjusting an implied codevector by an implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and producing an encoded speech signal (decoder determines concealment, LP filter, LP filter excitation signal) based on an output of the predictive coding synthesis filter .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (encoded speech signal, inverse filtering) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q (high p, pitch p) = E 1 · ( E LP0 / E LP1 ) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6073092A
CLAIM 7
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the encoder means further comprises : (a) high pass (average pitch, E q) filtering the speech ;
(b) dividing the speech into frames of speech ;
(c) providing autocorrelation calculation of the frames of speech ;
(d) generating prediction coefficients from the speech samples using linear prediction coding analysis ;
(e) bandwidth expanding the prediction coefficients ;
(f) transforming the bandwidth expanded prediction coefficients into line spectrum pair frequencies ;
(g) transforming the line spectrum pair frequencies into line spectrum pair residual vectors ;
(h) split vector quantizing the line spectrum pair residual vectors ;
(i) decoding the line spectrum pair frequencies ;
(j) interpolating the line spectrum pair frequencies ;
(k) converting the line spectrum pair frequencies to linear coding prediction coefficients ;
(l) extracting pitch filter parameters from the frames of speech ;
(m) encoding the pitch filter parameters ;
and (n) extracting mixed excitation function parameters from the baseline codebook and the implied codebook .

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering (decoder determines concealment, LP filter, LP filter excitation signal) the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post (average pitch, E q) filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .

US6073092A
CLAIM 15
. A method of encoding a speech signal comprising : adjusting a baseline codevector by a baseline gain and adjusting an implied codevector by an implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and producing an encoded speech signal (decoder determines concealment, LP filter, LP filter excitation signal) based on an output of the predictive coding synthesis filter .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (random code) for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US6073092A
CLAIM 2
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 wherein the two fixed codebooks (average energy) further comprise : (a) selecting the first of the combination of two codevectors from a pulse codebook with a plurality of pulse codevectors ;
and (b) selecting the second of the combination of two codevectors from a random codebook (signal energy) with a plurality of random codevectors .

US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (encoded speech signal, inverse filtering) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (high p, pitch p) = E 1 · ( E LP0 / E LP1 ) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6073092A
CLAIM 5
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 1 further comprising : (a) representing the plurality of codevectors with a codebook index ;
and (b) representing the adaptive codevector with an adaptive codebook (sound signal, speech signal) index , wherein the indices and codebook gain vectors are encoded as the digital data stream .

US6073092A
CLAIM 7
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 6 wherein the encoder means further comprises : (a) high pass (average pitch, E q) filtering the speech ;
(b) dividing the speech into frames of speech ;
(c) providing autocorrelation calculation of the frames of speech ;
(d) generating prediction coefficients from the speech samples using linear prediction coding analysis ;
(e) bandwidth expanding the prediction coefficients ;
(f) transforming the bandwidth expanded prediction coefficients into line spectrum pair frequencies ;
(g) transforming the line spectrum pair frequencies into line spectrum pair residual vectors ;
(h) split vector quantizing the line spectrum pair residual vectors ;
(i) decoding the line spectrum pair frequencies ;
(j) interpolating the line spectrum pair frequencies ;
(k) converting the line spectrum pair frequencies to linear coding prediction coefficients ;
(l) extracting pitch filter parameters from the frames of speech ;
(m) encoding the pitch filter parameters ;
and (n) extracting mixed excitation function parameters from the baseline codebook and the implied codebook .

US6073092A
CLAIM 14
. The method for speech coding based on a code excited linear prediction (CELP) model of claim 14 wherein post filtering the output of the LPC filter further comprises : (a) inverse filtering (decoder determines concealment, LP filter, LP filter excitation signal) the output of the LPC filter with a zero filter to produce a residual signal ;
(b) operating on the residual signal output of the zero filter with a pitch post (average pitch, E q) filter ;
(c) operating on the output of the pitch post filter with an all-pole filter ;
(d) operating on the output of the all-pole filter with a tilt compensation filter to generate post-filtered speech ;
(e) operating on the output of the tilt compensation filter with a gain control to match the energy of the postfilter input ;
and (f) operating on the output of the gain control with a highpass filter to produce perceptually enhanced speech .

US6073092A
CLAIM 15
. A method of encoding a speech signal comprising : adjusting a baseline codevector by a baseline gain and adjusting an implied codevector by an implied gain to form a mixed excitation function ;
using the mixed excitation function as an input to a pitch filter ;
using the output of the pitch filter as an input of a linear predictive coding synthesis filter ;
and producing an encoded speech signal (decoder determines concealment, LP filter, LP filter excitation signal) based on an output of the predictive coding synthesis filter .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5966689A

Filed: 1997-06-18     Issued: 1999-10-12

Adaptive filter and filtering method for low bit rate coding

(Original Assignee) Texas Instruments Inc     (Current Assignee) Texas Instruments Inc

Alan V. McCree
US7693710B2
CLAIM 1
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (said signals) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (unit delay, second filtering, first filter) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (unit delay, second filtering, first filter) up to the end of a last subframe affected by the artificial construction of the periodic part .
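The claim above recites constructing the periodic excitation as a low-pass filtered train of pulses: the first low-pass impulse response is centered on the quantized first-glottal-pulse position and the remaining copies are spaced by the average pitch value. A minimal sketch of that construction, assuming a short symmetric FIR prototype for the low-pass impulse response and integer rounding of pulse positions (both assumptions, not the patent's filter):

```python
import numpy as np

def build_periodic_excitation(first_pulse_pos, avg_pitch, length, lp_impulse=None):
    if lp_impulse is None:
        lp_impulse = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # illustrative low-pass impulse response
    exc = np.zeros(length)
    half = len(lp_impulse) // 2
    k = 0
    while True:
        pos = first_pulse_pos + int(round(k * avg_pitch))  # first pulse, then every avg_pitch samples
        if pos >= length:
            break
        for i, h in enumerate(lp_impulse):                 # centre the impulse response on pos
            n = pos - half + i
            if 0 <= n < length:
                exc[n] += h
        k += 1
    return exc

# e.g. a 256-sample frame, first glottal pulse at sample 35, average pitch of 57.5 samples
periodic_part = build_periodic_excitation(35, 57.5, 256)
```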
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals ;
generating a signal probability estimator value based on a comparison of signal power of said signals (onset frame) in a current frame to a long term estimate of noise power ;
first filtering (first impulse, impulse response, pass filter, first impulse response) said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering (first impulse, impulse response, pass filter, first impulse response) by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay (first impulse, impulse response, pass filter, first impulse response) operator .
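Claim 1 of US5966689A defines the "second filtering" by the transfer function 1-μz -1 * sig-prob, with sig-prob derived from comparing current-frame power to a long-term noise-power estimate. A minimal sketch, assuming a linear dB-domain mapping of that ratio onto [0, 1] and an illustrative value of μ (neither is specified by the reference):

```python
import numpy as np

def signal_probability(frame, noise_power, lo_db=0.0, hi_db=10.0):
    # Map the current-frame-power to long-term-noise-power ratio (in dB) linearly onto [0, 1]
    frame_power = float(np.mean(np.asarray(frame, float) ** 2)) + 1e-12
    snr_db = 10.0 * np.log10(frame_power / (noise_power + 1e-12))
    return float(np.clip((snr_db - lo_db) / (hi_db - lo_db), 0.0, 1.0))

def second_filter(frame, sig_prob, mu=0.5):
    # Transfer function 1 - mu*sig_prob*z^-1  =>  y[n] = x[n] - mu*sig_prob*x[n-1]
    x = np.asarray(frame, float)
    y = x.copy()
    y[1:] -= mu * sig_prob * x[:-1]
    return y
```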

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
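Claim 2 has the encoder quantize the shape, sign and amplitude of the first glottal pulse as the phase information parameter. The sketch below is only a schematic illustration of such an encoder; the two-entry shape codebook, the 4-bit linear amplitude quantizer and the correlation-based shape selection are hypothetical choices, not the patent's codebooks or bit allocation.

```python
import numpy as np

# Hypothetical two-entry shape codebook; each entry is a normalized pulse shape
SHAPE_CODEBOOK = [np.array([0.3, 1.0, 0.3]),
                  np.array([0.1, 0.5, 1.0, 0.5, 0.1])]

def encode_first_glottal_pulse(segment, amp_step=100.0, amp_levels=16):
    seg = np.asarray(segment, float)
    peak_idx = int(np.argmax(np.abs(seg)))
    peak = seg[peak_idx]
    sign = 0 if peak >= 0.0 else 1                                            # 1 bit of sign
    amp_index = int(np.clip(round(abs(peak) / amp_step), 0, amp_levels - 1))  # linear amplitude index
    best_corr, shape_index = -np.inf, 0
    for i, shape in enumerate(SHAPE_CODEBOOK):                                # best-matching shape
        half = len(shape) // 2
        lo, hi = peak_idx - half, peak_idx + half + 1
        if lo < 0 or hi > len(seg):
            continue
        window = np.abs(seg[lo:hi])
        corr = np.dot(window, shape) / (np.linalg.norm(window) * np.linalg.norm(shape) + 1e-12)
        if corr > best_corr:
            best_corr, shape_index = corr, i
    return {"sign": sign, "amplitude_index": amp_index, "shape_index": shape_index}
```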
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
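Claim 3 locates the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizes its position. A minimal sketch, assuming the search is performed on the LP residual and that a 6-bit uniform quantizer covers the pitch period (both assumptions for illustration):

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, bits=6):
    # Sample of maximum amplitude within the first pitch period of the LP residual
    T = int(pitch_period)
    pos = int(np.argmax(np.abs(np.asarray(residual[:T], float))))
    # Uniform quantization of the position inside the pitch period
    levels = 2 ** bits
    step = T / levels
    index = min(int(pos / step), levels - 1)
    decoded_pos = int(round(index * step))       # position reconstructed at the decoder
    return pos, index, decoded_pos
```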
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 26
. The system of claim 18 wherein said system (maximum amplitude) is a MELP coder .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (successive samples, current frame, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (digital signals) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
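Claim 4 computes the energy information parameter from the maximum of the signal energy for frames classified as voiced or onset, and from the average energy per sample otherwise. A minimal sketch, assuming the maximum is taken over squared samples and the result is expressed in dB (both assumptions, the patent's exact measure and quantizer are not reproduced here):

```python
import numpy as np

def energy_information(frame, frame_class):
    x = np.asarray(frame, float)
    if frame_class in ("voiced", "onset"):
        e = np.max(x ** 2)                   # maximum of the signal energy for voiced/onset frames
    else:
        e = np.mean(x ** 2)                  # average energy per sample for the other classes
    return 10.0 * np.log10(e + 1e-12)        # energy information parameter, here in dB
```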
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals (current frame, decoder determines concealment, speech signal) ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
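Claim 5 scales the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the concealed frames, then converges toward the received energy information toward the end of the frame while limiting any increase. A minimal sketch, assuming RMS measurement over short windows, linear sample-by-sample gain interpolation and an illustrative gain cap; the window length and cap are assumptions:

```python
import numpy as np

def control_energy(synth_frame, e_end_concealed, e_received, win=40, max_gain=1.98):
    x = np.asarray(synth_frame, float)
    e_begin = np.sqrt(np.mean(x[:win] ** 2)) + 1e-12   # energy at the start of the good frame
    e_end = np.sqrt(np.mean(x[-win:] ** 2)) + 1e-12    # energy toward the end of the good frame
    g0 = e_end_concealed / e_begin                     # match the end of the concealed signal
    g1 = min(e_received / e_end, max_gain)             # converge to the received energy, capped
    gains = np.linspace(g0, g1, len(x))                # sample-by-sample gain interpolation
    return x * gains, g0, g1
```

The returned g0 and g1 gains are reused in the decision sketch given after claim 7 below.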
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (digital signals) is a speech signal (successive samples, current frame, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals (current frame, decoder determines concealment, speech signal) ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (digital signals) is a speech signal (successive samples, current frame, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
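Claims 6 and 7 add special cases to the gain used for that scaling: capping it when the first good frame after the erasure is an onset, and using the end-of-frame gain across the whole frame for a voiced-to-unvoiced transition or a comfort-noise-to-active-speech transition. A minimal decision sketch reusing the g0/g1 gains from the sketch under claim 5; the onset cap value is an assumption:

```python
def select_gains(g0, g1, last_good_class, first_good_class,
                 last_was_comfort_noise=False, first_is_active_speech=True, onset_cap=1.2):
    if first_good_class == "onset":
        g0 = min(g0, onset_cap)          # claim 6: limit the scaling gain to a given value
    voiced_like = ("voiced transition", "voiced", "onset")
    if (last_good_class in voiced_like and first_good_class == "unvoiced") or \
       (last_was_comfort_noise and first_is_active_speech):
        g0 = g1                          # claim 7: same gain at frame start and frame end
    return g0, g1
```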
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals (current frame, decoder determines concealment, speech signal) ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 · E LP0 / E LP1 where E 1 is an energy at an end of the current frame (successive samples, current frame, speech signal) , E LP0 is an energy of an impulse response (unit delay, second filtering, first filter) of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
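Claim 9 (with the relation reconstructed above as E q = E 1 · E LP0 / E LP1) sets a target excitation energy from the energies of the impulse responses of the previous and current LP synthesis filters. A minimal sketch, assuming a 64-sample truncated impulse response of 1/A(z) and a simple energy-matching rescale of the excitation; both choices are illustrative assumptions:

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(lpc, n=64):
    # Energy of the (truncated) impulse response of 1/A(z), with A(z) = 1 - sum a_k z^-k
    den = np.concatenate(([1.0], -np.asarray(lpc, float)))
    impulse = np.r_[1.0, np.zeros(n - 1)]
    h = lfilter([1.0], den, impulse)
    return float(np.sum(h ** 2))

def adjust_excitation_energy(excitation, e1, lpc_prev, lpc_curr):
    e_lp0 = lp_impulse_response_energy(lpc_prev)   # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(lpc_curr)   # first good frame after the erasure
    e_q = e1 * e_lp0 / e_lp1                       # target excitation energy E_q
    x = np.asarray(excitation, float)
    e_x = float(np.sum(x ** 2)) + 1e-12
    return x * np.sqrt(e_q / e_x)                  # rescale the excitation to energy E_q
```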
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals (current frame, decoder determines concealment, speech signal) ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering (first impulse, impulse response, pass filter, first impulse response) said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering (first impulse, impulse response, pass filter, first impulse response) by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay (first impulse, impulse response, pass filter, first impulse response) operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (predicted value) caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 26
. The system of claim 18 wherein said system (maximum amplitude) is a MELP coder .

US5966689A
CLAIM 31
. The filter of claim 30 wherein said first filter has a transfer function of ##EQU8## where P is the predicted value (concealing frame erasure) , α and β are scaling factors , z is the inverse of the unit delay z -1 , and μ is a scaling factor .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (digital signals) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 where E 1 is an energy at an end of the current frame (successive samples, current frame, speech signal) , E LP0 is an energy of an impulse response (unit delay, second filtering, first filter) of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals (current frame, decoder determines concealment, speech signal) ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering (first impulse, impulse response, pass filter, first impulse response) said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering (first impulse, impulse response, pass filter, first impulse response) by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay (first impulse, impulse response, pass filter, first impulse response) operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (said signals) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (unit delay, second filtering, first filter) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (unit delay, second filtering, first filter) up to an end of a last subframe affected by the artificial construction of the periodic part .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals ;
generating a signal probability estimator value based on a comparison of signal power of said signals (onset frame) in a current frame to a long term estimate of noise power ;
first filtering (first impulse, impulse response, pass filter, first impulse response) said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering (first impulse, impulse response, pass filter, first impulse response) by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay (first impulse, impulse response, pass filter, first impulse response) operator .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz -1 * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 26
. The system of claim 18 wherein said system (maximum amplitude) is a MELP coder .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (successive samples, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (digital signals) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signals (current frame, decoder determines concealment, speech signal) ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz -1 * signal probability value where μ is a scaling factor and z -1 is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .
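For reference, the sig-prob mechanism recited in US5966689A claims 1 and 10 can be illustrated with a short sketch. This is not the patent's implementation: the mapping from the frame-power to noise-power ratio onto sig-prob, the value of μ and all function names are assumptions chosen for illustration, and the first-stage filter (the ##EQU6## transfer function, which the chart does not reproduce) is omitted.

    import numpy as np

    def sig_prob(frame, noise_power_est, eps=1e-12):
        # Signal probability estimate obtained by comparing current-frame power
        # to a long-term noise power estimate; the bounded SNR mapping below is
        # an illustrative stand-in, not the formula of US5966689A.
        frame = np.asarray(frame, dtype=float)
        frame_power = float(np.mean(frame ** 2))
        snr = frame_power / (noise_power_est + eps)
        return float(np.clip((snr - 1.0) / (snr + 1.0), 0.0, 1.0))

    def second_filter(x, sp, mu=0.5):
        # Second-stage filter 1 - mu * sig-prob * z^-1, i.e.
        # y[n] = x[n] - mu * sp * x[n-1], applied sample by sample.
        x = np.asarray(x, dtype=float)
        y = np.empty_like(x)
        prev = 0.0
        for n, xn in enumerate(x):
            y[n] = xn - mu * sp * prev
            prev = xn
        return y

As sig-prob falls toward zero in noise-dominated frames, the second filter approaches an identity response, so the amount of spectral emphasis applied scales with the estimated signal probability.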

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .
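A compact way to read the last element of '710 claim 16 above is as a classification-dependent energy measure. The sketch below is illustrative only: it works in the linear domain and ignores any log scaling or pitch-synchronous windowing, and the function name and class labels are assumptions taken from the claim wording.

    import numpy as np

    def energy_info(frame, frame_class):
        # Energy information parameter per the rule of claim 16:
        # a maximum-based energy for frames classified as voiced or onset,
        # an average energy per sample for all other classes.
        frame = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset"):
            return float(np.max(frame ** 2))   # maximum of the signal energy
        return float(np.mean(frame ** 2))      # average energy per sample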

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .
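The energy-control element of '710 claim 17 above combines two steps: match the frame-start energy to the end of the concealed segment, then converge toward the transmitted energy information by the frame end while limiting any increase. The sketch below is a minimal illustration under assumed choices (whole-frame mean energy, a linear gain ramp, and a max_gain cap); it is not the codec's actual recovery procedure.

    import numpy as np

    def rescale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0, eps=1e-12):
        # g0 matches the start of the frame to the energy at the end of the
        # last concealed frame; g1 moves the end of the frame toward the
        # received energy information, with the increase capped by max_gain.
        synth = np.asarray(synth, dtype=float)
        e_synth = float(np.mean(synth ** 2)) + eps
        g0 = np.sqrt(e_end_concealed / e_synth)
        g1 = min(np.sqrt(e_target / e_synth), max_gain)   # limit the energy increase
        gains = np.linspace(g0, g1, num=len(synth))       # sample-by-sample interpolation
        return gains * synth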

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (digital signals) is a speech signal (successive samples, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signal (current frame, decoder determines concealment, speech signal) s ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz⁻¹ * signal probability value where μ is a scaling factor and z⁻¹ is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (digital signals) is a speech signal (successive samples, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signal (current frame, decoder determines concealment, speech signal) s ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz⁻¹ * signal probability value where μ is a scaling factor and z⁻¹ is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .
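The two exceptions recited in '710 claim 19 above (a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition) reduce to a simple predicate deciding when the frame-start scaling gain is forced equal to the frame-end gain. The helper below is only a paraphrase of the claim conditions; the argument names are assumptions.

    def use_flat_gain(prev_class, curr_class, prev_is_comfort_noise, curr_is_active_speech):
        # True when claim 19 makes the gain at the beginning of the first good
        # frame equal to the gain used at its end.
        voiced_to_unvoiced = (prev_class in ("voiced transition", "voiced", "onset")
                              and curr_class == "unvoiced")
        cn_to_active = prev_is_comfort_noise and curr_is_active_speech
        return voiced_to_unvoiced or cn_to_active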

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of a current frame (successive samples, current frame, speech signal) , E_LP0 is an energy of an impulse response (unit delay, second filtering, first filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signal (current frame, decoder determines concealment, speech signal) s ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filter (first impulse, impulse response, pass filter, first impulse response) ing said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering (first impulse, impulse response, pass filter, first impulse response) by the transfer function of the form 1-μz⁻¹ * signal probability value where μ is a scaling factor and z⁻¹ is a unit delay (first impulse, impulse response, pass filter, first impulse response) operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .
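The relation in '710 claim 21 above (reconstructed here as E_q = E_1 · E_LP0 / E_LP1) can be sketched as follows. The impulse-response truncation length and the use of scipy for the filtering are assumptions of this sketch; a_last_good and a_first_good stand for the LP coefficient vectors [1, a1, ..., ap] of the frames named in the claim.

    import numpy as np
    from scipy.signal import lfilter

    def lp_impulse_response_energy(a, n=64):
        # Energy of the (truncated) impulse response of the LP synthesis
        # filter 1/A(z); the truncation to n samples is an assumption.
        imp = np.zeros(n)
        imp[0] = 1.0
        h = lfilter([1.0], a, imp)
        return float(np.sum(h ** 2))

    def adjusted_excitation_energy(e1, a_last_good, a_first_good):
        # E_q = E_1 * E_LP0 / E_LP1, where E_LP0 and E_LP1 are the
        # impulse-response energies of the LP filters of the last frame
        # before the erasure and of the first frame after it.
        e_lp0 = lp_impulse_response_energy(a_last_good)
        e_lp1 = lp_impulse_response_energy(a_first_good)
        return e1 * e_lp0 / e_lp1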

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (said system) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 26
. The system of claim 18 wherein said system (maximum amplitude) is a MELP coder .
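For '710 claim 23 above, the phase information reduces to locating and quantizing the sample of maximum amplitude within a pitch period. The sketch below searches the LP residual and quantizes the position on a uniform grid; both choices, along with the bit allocation, are assumptions rather than the patent's encoding.

    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period):
        # The sample of maximum (absolute) amplitude within one pitch period
        # is taken as the first glottal pulse.
        window = np.asarray(residual[:pitch_period], dtype=float)
        return int(np.argmax(np.abs(window)))

    def quantize_pulse_position(position, pitch_period, bits=6):
        # Uniform quantization of the pulse position within the pitch period;
        # returns the index and the reconstructed position.
        levels = 2 ** bits
        step = max(pitch_period, 1) / levels
        index = min(int(position / step), levels - 1)
        return index, index * step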

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (successive samples, current frame, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (digital signals) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signal (current frame, decoder determines concealment, speech signal) s ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filtering said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering by the transfer function of the form 1-μz⁻¹ * signal probability value where μ is a scaling factor and z⁻¹ is a unit delay operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (digital signals) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of a current frame (successive samples, current frame, speech signal) , E_LP0 is an energy of an impulse response (unit delay, second filtering, first filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5966689A
CLAIM 1
. A filtering method for improving digitally processed speech signal (current frame, decoder determines concealment, speech signal) s ;
generating a signal probability estimator value based on a comparison of signal power of said signals in a current frame (current frame, decoder determines concealment, speech signal) to a long term estimate of noise power ;
first filter (first impulse, impulse response, pass filter, first impulse response) ing said signals wherein the filtering is controlled by linear predictive coefficients and said signal probability value ;
and second filtering (first impulse, impulse response, pass filter, first impulse response) by the transfer function of the form 1-μz⁻¹ * signal probability value where μ is a scaling factor and z⁻¹ is a unit delay (first impulse, impulse response, pass filter, first impulse response) operator .

US5966689A
CLAIM 10
. A filtering method for enhancing digitally processed speech or audio signals comprising the steps of : buffering said speech or audio signals into frames of vectors , each vector having K successive samples (current frame, decoder determines concealment, speech signal) ;
performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute linear predictive coefficients and signal power in the current frame ;
generating a signal probability estimator value sig-prob based on comparison of the signal power in the current frame to a long term estimate of the noise power ;
first filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU6## where 1-P is the LPC coefficient , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors * sig-prob ;
and second filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .

US5966689A
CLAIM 18
. A low bit rate speech communication system for transmitting speech signals comprising : means for buffering said speech signals into frames of vectors , each vector having successive samples ;
means for performing analysis of said buffered frames of speech or audio signals in predetermined blocks to compute encoded speech including linear predictive coefficients and power in the current frame ;
means for transmitting said encoded speech over a transmission channel , a synthesizer coupled to said means for transmitting and responsive to said encoded speech for decoding said speech into digital signals (sound signal, signal energy) ;
a digital to analog converter means responsive to said digital signals from said synthesizer for providing speech signals , said synthesizer comprising means for enhancing digitally processed speech comprising : means for generating a signal probability estimator value sig-prob based on comparison of the power in the current frame to a long term estimate of the noise power ;
first filter means for filtering each vector by a delay controlled by said linear predictive coefficient and said signal probability estimator value , wherein filtering is accomplished by using a transfer function of the form ##EQU7## where 1-P is the LPC coefficients , z is the inverse of the unit delay operator used in the transform representation of the transfer functions , α and β are scaling factors ;
and second filter means for filtering by the transfer function of the form 1-μz⁻¹ * sig-prob , where μ=scaling factor .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5878388A

Filed: 1997-06-09     Issued: 1999-03-02

Voice analysis-synthesis method using noise having diffusion which varies with frequency band to modify predicted phases of transmitted pitch data blocks

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Masayuki Nishiguchi, Jun Matsumoto, Shinobu Ono
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .
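The band-wise analysis in US5878388A claim 1 above can be pictured with a short sketch: the block spectrum is split into harmonic bands derived from the pitch, and each band yields a power value and a voiced/unvoiced decision. The peak-to-total-power test used for the decision below is an illustrative stand-in, not the reference's criterion, and the band width of one harmonic is likewise assumed.

    import numpy as np

    def per_band_power_and_vuv(block, pitch_lag, vuv_threshold=0.5):
        # Split the power spectrum into pitch-derived bands and return
        # (power, voiced) per band.
        block = np.asarray(block, dtype=float)
        spec = np.abs(np.fft.rfft(block)) ** 2
        bins_per_band = max(1, int(round(len(block) / pitch_lag)))
        bands = []
        for start in range(0, len(spec), bins_per_band):
            band = spec[start:start + bins_per_band]
            power = float(np.sum(band))
            voiced = power > 0 and float(np.max(band)) / (power + 1e-12) > vuv_threshold
            bands.append((power, bool(voiced)))
        return bands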

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5878388A
CLAIM 3
. A pitch extraction method for processing an input audio signal comprising frames , each of the frames corresponding to a different time along a time axis , said method comprising the steps of : detecting plural peaks from auto-correlation data of a current frame (current frame, decoder determines concealment) , where the current frame is one of said frames ;
and detecting a pitch of the current frame by determining a position of a maximum peak among the detected plural peaks of the current frame when the maximum peak is equal to or larger than a predetermined threshold , and deciding the pitch of the current frame by determining a position of a peak in a pitch range having a predetermined relation with a pitch found in one of the frames other than said current frame when the maximum peak is smaller than the predetermined threshold .
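US5878388A claim 3 above describes a two-branch pitch decision: take the lag of the strongest autocorrelation peak when it clears a threshold, otherwise restrict the search to a range tied to the pitch of a neighbouring frame. The sketch below is only an illustration; the normalisation, lag limits, threshold value and the ±20% fallback range are all assumptions.

    import numpy as np

    def detect_pitch(frame, prev_pitch, min_lag=20, max_lag=147, threshold=0.3):
        frame = np.asarray(frame, dtype=float)
        ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        ac = ac / (ac[0] + 1e-12)                      # normalised autocorrelation
        hi = min(max_lag, len(ac) - 1)
        lags = np.arange(min_lag, hi + 1)
        best = int(lags[np.argmax(ac[lags])])
        if ac[best] >= threshold:
            return best                                # maximum peak is reliable
        lo = max(min_lag, int(0.8 * prev_pitch))       # otherwise search near the
        hi = min(hi, int(1.2 * prev_pitch))            # pitch found in another frame
        lags = np.arange(lo, hi + 1)
        return int(lags[np.argmax(ac[lags])])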

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5878388A
CLAIM 3
. A pitch extraction method for processing an input audio signal comprising frames , each of the frames corresponding to a different time along a time axis , said method comprising the steps of : detecting plural peaks from auto-correlation data of a current frame (current frame, decoder determines concealment) , where the current frame is one of said frames ;
and detecting a pitch of the current frame by determining a position of a maximum peak among the detected plural peaks of the current frame when the maximum peak is equal to or larger than a predetermined threshold , and deciding the pitch of the current frame by determining a position of a peak in a pitch range having a predetermined relation with a pitch found in one of the frames other than said current frame when the maximum peak is smaller than the predetermined threshold .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5878388A
CLAIM 3
. A pitch extraction method for processing an input audio signal comprising frames , each of the frames corresponding to a different time along a time axis , said method comprising the steps of : detecting plural peaks from auto-correlation data of a current frame (current frame, decoder determines concealment) , where the current frame is one of said frames ;
and detecting a pitch of the current frame by determining a position of a maximum peak among the detected plural peaks of the current frame when the maximum peak is equal to or larger than a predetermined threshold , and deciding the pitch of the current frame by determining a position of a peak in a pitch range having a predetermined relation with a pitch found in one of the frames other than said current frame when the maximum peak is smaller than the predetermined threshold .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5878388A
CLAIM 1
. A voice analysis-synthesis method , comprising the steps of : dividing an input voice signal on a block-by-block basis and extracting pitch data from each block ;
converting the voice signal , on the block-by-block basis , into frequency-domain data ;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data , each of said bands corresponding to a different range of frequencies ;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
transmitting the pitch data , the power information for said each of the bands of said each of the blocks , and the voiced/unvoiced decision information for said each of the bands of said each of the blocks ;
receiving the pitch data , the power information , and the voiced/unvoiced decision information , and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase (speech signal) for said each block of the received pitch data ;
and modifying the predicted block terminal edge phase , using noise having diffusion which varies from band to band for each of the bands .

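The band-wise analysis in US5878388A claim 1 (pitch-derived bands, per-band power and a voiced/unvoiced decision) can be pictured with the short sketch below; the Hann window, the band width derived as block length over pitch lag, and the peakiness criterion for the V/UV flag are assumptions, not the reference's actual decision rule.

    import numpy as np

    def analyze_block(block, pitch_lag):
        """Split the block spectrum into pitch-derived bands and return a
        (power, voiced_flag) pair per band."""
        spec = np.abs(np.fft.rfft(block * np.hanning(len(block)))) ** 2
        f0_bin = max(1, len(block) // pitch_lag)   # bins per harmonic band
        bands = []
        for start in range(0, len(spec) - f0_bin, f0_bin):
            band = spec[start:start + f0_bin]
            power = float(np.sum(band))
            voiced = bool(np.max(band) > 4.0 * np.mean(band))  # peaky band -> voiced
            bands.append((power, voiced))
        return bands

    t = np.arange(256)
    print(analyze_block(np.sin(2 * np.pi * 200 * t / 8000.0), pitch_lag=40)[:3])
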
US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5878388A
CLAIM 3
. A pitch extraction method for processing an input audio signal comprising frames , each of the frames corresponding to a different time along a time axis , said method comprising the steps of : detecting plural peaks from auto-correlation data of a current frame (current frame, decoder determines concealment) , where the current frame is one of said frames ;
and detecting a pitch of the current frame by determining a position of a maximum peak among the detected plural peaks of the current frame when the maximum peak is equal to or larger than a predetermined threshold , and deciding the pitch of the current frame by determining a position of a peak in a pitch range having a predetermined relation with a pitch found in one of the frames other than said current frame when the maximum peak is smaller than the predetermined threshold .

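The two-stage pitch decision of US5878388A claim 3 (take the maximum autocorrelation peak if it clears a threshold, otherwise restrict the search to a range tied to a previously found pitch) is sketched below under assumed lag limits, threshold and search-window width; none of these values is taken from the reference.

    import numpy as np

    def detect_pitch(frame, prev_pitch, lag_min=20, lag_max=147,
                     threshold=0.4, window=0.2):
        frame = frame - np.mean(frame)
        energy = np.dot(frame, frame) + 1e-12
        # normalized autocorrelation over the candidate lag range
        corr = np.array([np.dot(frame[:-lag], frame[lag:]) / energy
                         for lag in range(lag_min, lag_max + 1)])
        peak_idx = int(np.argmax(corr))
        if corr[peak_idx] >= threshold:
            return lag_min + peak_idx            # position of the maximum peak
        # otherwise decide within a range related to the previous pitch
        lo = max(lag_min, int(prev_pitch * (1 - window)))
        hi = min(lag_max, int(prev_pitch * (1 + window)))
        sub = corr[lo - lag_min: hi - lag_min + 1]
        return lo + int(np.argmax(sub))

    # 200 Hz tone at 8 kHz sampling -> pitch lag of about 40 samples
    t = np.arange(160)
    print(detect_pitch(np.sin(2 * np.pi * 200 * t / 8000.0), prev_pitch=40))
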



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5873060A

Filed: 1997-05-27     Issued: 1999-02-16

Signal coder for wide-band signals

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5873060A
CLAIM 1
. A signal coder comprising : a spectral parameter calculator for obtaining a spectral parameter from an input signal and quantizing the spectral parameter thus obtained ;
a divider for dividing the input signal into a plurality of frequency sub-bands ;
a pitch calculator for obtaining pitch data in at least one of the frequency sub-bands and obtaining a pitch prediction signal ;
a judging unit (average pitch value) for obtaining the pitch prediction signal in at least one of the frequency sub-bands and executing pitch prediction judgment ;
and an excitation quantizer for synthesizing the pitch prediction signal , subtracting the obtained pitch prediction signal from the input signal to obtain an excitation signal , and quantizing the obtained excitation signal .

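Claims 1 and 13 above describe the artificial periodic excitation built when an onset frame is lost: a low-pass filter impulse response centered on the quantized first-glottal-pulse position and repeated every rounded average pitch period up to the end of the last affected subframe. The sketch below uses a made-up three-tap FIR low-pass and an assumed excitation length; it is an illustration, not the codec's construction.

    import numpy as np

    def build_periodic_excitation(first_pulse_pos, avg_pitch, length=256,
                                  lp_taps=(0.18, 0.64, 0.18)):
        exc = np.zeros(length)
        pos = int(first_pulse_pos)
        pitch = int(round(avg_pitch))
        while pos < length:
            # center the low-pass impulse response on the pulse position
            start = pos - len(lp_taps) // 2
            for k, tap in enumerate(lp_taps):
                idx = start + k
                if 0 <= idx < length:
                    exc[idx] += tap
            pos += pitch                  # next pulse one average pitch later
        return exc

    exc = build_periodic_excitation(first_pulse_pos=17, avg_pitch=57.4)
    print(np.nonzero(exc)[0][:6])        # pulse responses at 16-18, 73-75, ...
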
US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5873060A
CLAIM 2
. The signal coder according to claim 1 , wherein the excitation signal of the input signal is quantized by expressing it as a plurality of non-zero amplitude (current frame) pulses .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5873060A
CLAIM 2
. The signal coder according to claim 1 , wherein the excitation signal of the input signal is quantized by expressing it as a plurality of non-zero amplitude (current frame) pulses .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5873060A
CLAIM 1
. A signal coder comprising : a spectral parameter calculator for obtaining a spectral parameter from an input signal and quantizing the spectral parameter thus obtained ;
a divider for dividing the input signal into a plurality of frequency sub-bands ;
a pitch calculator for obtaining pitch data in at least one of the frequency sub-bands and obtaining a pitch prediction signal ;
a judging unit (average pitch value) for obtaining the pitch prediction signal in at least one of the frequency sub-bands and executing pitch prediction judgment ;
and an excitation quantizer for synthesizing the pitch prediction signal , subtracting the obtained pitch prediction signal from the input signal to obtain an excitation signal , and quantizing the obtained excitation signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (zero amplitude) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5873060A
CLAIM 2
. The signal coder according to claim 1 , wherein the excitation signal of the input signal is quantized by expressing it as a plurality of non-zero amplitude (current frame) pulses .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (zero amplitude) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5873060A
CLAIM 2
. The signal coder according to claim 1 , wherein the excitation signal of the input signal is quantized by expressing it as a plurality of non-zero amplitude (current frame) pulses .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6009122A

Filed: 1997-05-12     Issued: 1999-12-28

Method and apparatus for superframe bit allocation

(Original Assignee) Amati Communications Corp     (Current Assignee) Texas Instruments Inc

Jacky S. Chow
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

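For context on the US6009122A side of the chart, claim 8's data symbol decoder reads a different bit allocation for every frame of a superframe. A rough sketch under assumed data layouts (a list of per-frame bits-per-tone vectors and a flat list of bits) follows; the layout is hypothetical and only illustrates the table-driven decoding idea.

    def decode_superframe(bitstream_bits, superframe_table):
        """bitstream_bits: flat list of 0/1 bits; superframe_table: one
        bits-per-tone list per frame of the superframe."""
        pos, frames = 0, []
        for bits_per_tone in superframe_table:     # separate allocation per frame
            tones = []
            for nbits in bits_per_tone:
                value = 0
                for _ in range(nbits):             # read nbits for this tone
                    value = (value << 1) | bitstream_bits[pos]
                    pos += 1
                tones.append(value)
            frames.append(tones)
        return frames

    table = [[2, 3], [1, 4]]                       # two-frame superframe
    bits = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    print(decode_superframe(bits, table))          # [[2, 6], [1, 3]]
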
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

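Claims 2, 10 and 14 encode the shape, sign and amplitude of the first glottal pulse as phase information. A hedged sketch is given below: the three-entry shape codebook, the 4-bit log-amplitude quantizer and the correlation-based shape selection are illustrative assumptions, not the patent's tables or bit allocation.

    import numpy as np

    SHAPES = np.array([[0.0, 1.0, 0.0],      # single impulse
                       [0.3, 1.0, 0.3],      # slightly spread pulse
                       [-0.3, 1.0, -0.3]])   # pulse with negative side lobes

    def encode_first_glottal_pulse(segment):
        """segment: samples centered on the detected first glottal pulse."""
        amp = float(np.max(np.abs(segment)))
        sign = 1 if segment[int(np.argmax(np.abs(segment)))] >= 0 else 0
        amp_index = int(np.clip(round(np.log2(amp + 1e-9)), 0, 15))  # 4-bit log amplitude
        target = segment[:SHAPES.shape[1]] / (amp + 1e-9)
        shape_index = int(np.argmax(SHAPES @ target))                # best-matching shape
        return shape_index, sign, amp_index

    print(encode_first_glottal_pulse(np.array([0.2, 0.9, 0.1])))
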
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

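Claims 3, 11 and 15 locate the first glottal pulse as the sample of maximum amplitude within a pitch period and quantize its position. A minimal sketch, assuming a uniform quantizer and a 6-bit position index (both illustrative choices):

    import numpy as np

    def quantize_glottal_pulse_position(residual, pitch_period, bits=6):
        search = residual[:int(pitch_period)]
        pos = int(np.argmax(np.abs(search)))        # sample of maximum amplitude
        levels = (1 << bits) - 1
        step = max(1, int(np.ceil(pitch_period / (levels + 1))))
        index = min(pos // step, levels)            # quantized position index
        return index, index * step                  # index to transmit, decoded position

    res = np.zeros(80)
    res[23] = 1.5
    print(quantize_glottal_pulse_position(res, pitch_period=57))
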
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (digital signals) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

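Claims 4, 16 and 24 compute the energy information parameter from the maximum signal energy for voiced or onset frames and from the average energy per sample otherwise. The sketch below assumes the value is expressed in dB before quantization; that domain and the class labels are assumptions.

    import numpy as np

    def energy_information(frame, frame_class):
        if frame_class in ("voiced", "onset"):
            e = float(np.max(frame ** 2))      # maximum of the signal energy
        else:
            e = float(np.mean(frame ** 2))     # average energy per sample
        return 10.0 * np.log10(e + 1e-12)      # dB value to be quantized and sent

    x = np.sin(2 * np.pi * np.arange(256) / 64.0)
    print(energy_information(x, "voiced"), energy_information(x, "unvoiced"))
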
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

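Claims 5 and 17 control the energy of the synthesized signal in the first good frame after an erasure: scale its start to the energy reached at the end of the last erased frame, then converge toward the received energy information by the end of the frame while limiting any increase. The linear per-sample gain interpolation, the 32-sample energy windows and the gain cap in the sketch below are assumptions.

    import numpy as np

    def energy_control(synth, e_end_of_erasure, e_received, max_gain=1.98):
        e_begin = float(np.mean(synth[:32] ** 2)) + 1e-12   # energy at frame start
        e_end = float(np.mean(synth[-32:] ** 2)) + 1e-12    # energy at frame end
        g0 = np.sqrt(e_end_of_erasure / e_begin)            # match erased-frame energy
        g1 = min(np.sqrt(e_received / e_end), max_gain)     # converge, limit increase
        n = len(synth)
        gains = g0 + (g1 - g0) * np.arange(n) / (n - 1)     # sample-by-sample gain
        return synth * gains

    out = energy_control(np.random.randn(256), e_end_of_erasure=0.5, e_received=1.0)
    print(out.shape)
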
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (digital signals) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (digital signals) is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

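Claims 7 and 19 recite two special cases (a voiced-to-unvoiced transition, and a comfort-noise-to-active-speech transition) in which the scaling gain at the beginning of the first good frame is simply set equal to the gain used at its end. A small decision sketch, with illustrative string labels for the classes and coding modes:

    def starting_gain(g0, g1, last_class, first_class, last_mode, first_mode):
        voiced_like = ("voiced transition", "voiced", "onset")
        voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
        cn_to_active = last_mode == "comfort noise" and first_mode == "active speech"
        return g1 if (voiced_to_unvoiced or cn_to_active) else g0

    print(starting_gain(0.7, 1.1, "voiced", "unvoiced",
                        "active speech", "active speech"))
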
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (digital frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency (LP filter, LP filter excitation signal) domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (digital frequency) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (analog signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009122A
CLAIM 1
. A transmitter for a data transmission system using multicarrier modulation , said transmitter comprising : a superframe bit allocation table , said superframe bit allocation tables stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
a data symbol encoder , said data symbol encoder receives digital data to be transmitted and encodes bits associated with the digital data to frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation table ;
a multicarrier modulation unit , said multicarrier modulation unit modulates the encoded bits on the frequency tones of a frame to produce modulated signals ;
and a digital-to-analog converter , said digital-to-analog converter converts the modulated signals to analog signals (current frame) , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency (LP filter, LP filter excitation signal) domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (digital signals) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (digital frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (analog signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009122A
CLAIM 1
. A transmitter for a data transmission system using multicarrier modulation , said transmitter comprising : a superframe bit allocation table , said superframe bit allocation tables stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
a data symbol encoder , said data symbol encoder receives digital data to be transmitted and encodes bits associated with the digital data to frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation table ;
a multicarrier modulation unit , said multicarrier modulation unit modulates the encoded bits on the frequency tones of a frame to produce modulated signals ;
and a digital-to-analog converter , said digital-to-analog converter converts the modulated signals to analog signals (current frame) , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency (LP filter, LP filter excitation signal) domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (digital signals) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .
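
Note (illustration only): the energy-control limitation of claim 17 charted above describes a two-stage gain trajectory over the first good frame after an erasure: the frame start is scaled toward the energy at the end of the last concealed frame, and the frame end converges toward the received energy information parameter with a limit on any energy increase. The minimal Python sketch below shows one possible reading of that limitation; the helper names (energy, conceal_energy_control, max_increase) and the quarter-frame measurement windows are assumptions for illustration, not the patent's or any codec's actual implementation.

```python
import numpy as np

def energy(x):
    """Mean energy per sample of a segment (small bias avoids division by zero)."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x * x)) + 1e-12

def conceal_energy_control(synth, prev_concealed_tail, received_energy, max_increase=2.0):
    """Scale the first good frame after an erasure: start near the energy at the
    end of the last concealed frame, converge toward the received energy
    parameter by the end of the frame, and limit how much the gain may grow."""
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    q = max(1, n // 4)                                             # measurement window (assumed)
    g0 = np.sqrt(energy(prev_concealed_tail) / energy(synth[:q]))  # match the start energy
    g1 = np.sqrt(received_energy / energy(synth[-q:]))             # converge to the received energy
    g1 = min(g1, g0 * max_increase)                                # limit the increase in energy
    gains = np.linspace(g0, g1, num=n)                             # smooth gain trajectory
    return synth * gains
```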

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (digital signals) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (digital signals) is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .
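
Note (illustration only): claims 18 and 19 charted above add two gain rules for the first good frame after an erasure: a cap on the scaling gain when that frame is classified as onset, and equal beginning-of-frame and end-of-frame gains during a voiced-to-unvoiced transition or a comfort-noise-to-active-speech transition. The sketch below is a hypothetical rendering of those rules only; the function name, the class labels, and the default cap value are assumptions.

```python
def select_scaling_gains(g0, g1, first_good_class, last_good_class,
                         last_good_was_comfort_noise, first_good_is_active_speech,
                         onset_gain_cap=1.0):
    """Return the (beginning, end) scaling gains for the first good frame."""
    # Claim 18 reading: onset frame -> limit the scaling gain to a given value.
    if first_good_class == "onset":
        g0, g1 = min(g0, onset_gain_cap), min(g1, onset_gain_cap)
    # Claim 19 reading: in these two transitions, use the end-of-frame gain
    # from the very beginning of the frame.
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    dtx_to_speech = last_good_was_comfort_noise and first_good_is_active_speech
    if voiced_to_unvoiced or dtx_to_speech:
        g0 = g1
    return g0, g1
```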

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (digital frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency (LP filter, LP filter excitation signal) domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (digital frequency) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (analog signal) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6009122A
CLAIM 1
. A transmitter for a data transmission system using multicarrier modulation , said transmitter comprising : a superframe bit allocation table , said superframe bit allocation tables stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
a data symbol encoder , said data symbol encoder receives digital data to be transmitted and encodes bits associated with the digital data to frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation table ;
a multicarrier modulation unit , said multicarrier modulation unit modulates the encoded bits on the frequency tones of a frame to produce modulated signals ;
and a digital-to-analog converter , said digital-to-analog converter converts the modulated signals to analog signal (current frame) s , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency (LP filter, LP filter excitation signal) domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .
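
Note (illustration only): claims 20 and 21 charted above adjust the target excitation energy when the LP filter of the first good frame has a higher gain than the filter in place during the erasure. The sketch below assumes the ratio form E_q = E_1 (E_LP0 / E_LP1) as reconstructed above, and it approximates the LP filter gain by the energy of a truncated impulse response of 1/A(z); the 64-sample truncation and the function names are assumptions, not claim limitations.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a, n=64):
    """Energy of a truncated impulse response of the LP synthesis filter 1/A(z),
    where a = [1, a1, ..., ap] are the LP coefficients."""
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.dot(h, h))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """If the new LP filter has the higher gain, scale the target energy by
    E_LP0 / E_LP1 (ratio reading of the charted relation); otherwise keep E_1."""
    e_lp0 = lp_impulse_response_energy(a_last_good)    # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)   # first good frame after the erasure
    if e_lp1 > e_lp0:
        return e1 * e_lp0 / e_lp1
    return e1
```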

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .
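
Note (illustration only): claims 22 and 23 charted above determine the phase information parameter by locating the first glottal pulse as the maximum-amplitude sample within a pitch period and quantizing its position. The minimal sketch below renders that reading; the uniform quantizer and its 6-bit resolution are assumptions, not claim limitations, and an LP residual is assumed to be available at the encoder.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period):
    """Take the sample of maximum amplitude within the first pitch period of the
    LP residual as the first glottal pulse (claim 23 reading)."""
    segment = np.abs(np.asarray(residual[: int(pitch_period)], dtype=float))
    return int(np.argmax(segment))

def quantize_pulse_position(position, pitch_period, bits=6):
    """Uniformly quantize the pulse position within the pitch period
    (resolution and grid are illustrative assumptions)."""
    levels = 1 << bits
    step = max(1, int(np.ceil(pitch_period / levels)))
    index = min(position // step, levels - 1)
    return index, index * step   # (index to transmit, decoded position)
```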

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (digital signals) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (digital signals) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .
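
Note (illustration only): claim 24 charted above computes the energy information parameter in two different ways depending on the frame class. The sketch below shows that branching only; the plain per-sample energy measures stand in for whatever windowing the codec actually applies and are assumptions.

```python
import numpy as np

def energy_information_parameter(frame, frame_class):
    """Claim 24 reading: maximum of the signal energy for voiced or onset frames,
    average energy per sample for the other classes (unvoiced, unvoiced
    transition, voiced transition)."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x * x))    # maximum of the signal energy
    return float(np.mean(x * x))       # average energy per sample
```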

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (digital signals) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (digital frequency) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (analog signal) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6009122A
CLAIM 1
. A transmitter for a data transmission system using multicarrier modulation , said transmitter comprising : a superframe bit allocation table , said superframe bit allocation tables stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
a data symbol encoder , said data symbol encoder receives digital data to be transmitted and encodes bits associated with the digital data to frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation table ;
a multicarrier modulation unit , said multicarrier modulation unit modulates the encoded bits on the frequency tones of a frame to produce modulated signals ;
and a digital-to-analog converter , said digital-to-analog converter converts the modulated signals to analog signal (current frame) s , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .

US6009122A
CLAIM 8
. An apparatus for recovering data transmitted by a transmitter , said apparatus comprising : an analog-to-digital converter , said analog-to-digital converter receives transmitted analog signals and produces digital signals (sound signal, signal energy) therefrom , the transmitted analog signals being time domain signals representing data transmitted ;
a demodulator , said demodulator receives the digital signals and demodulates the digital signals to produce digital frequency (LP filter, LP filter excitation signal) domain data ;
a superframe bit allocation table , said superframe bit allocation table stores superframe bit allocation information including separate bit allocation information for a plurality of frames of a superframe ;
and a data symbol decoder , said data symbol decoder operates to decode bits associated with the digital frequency domain data from frequency tones of a frame based on the superframe bit allocation information associated with the frame stored in said superframe bit allocation tables , wherein the superframe includes a plurality of frames , with one or more of the frames being capable of carrying data in a first direction and zero or more of the frames being capable of carrying data in a second direction .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5953697A

Filed: 1997-05-05     Issued: 1999-09-14

Gain estimation scheme for LPC vocoders with a shape index based on signal envelopes

(Original Assignee) Holtek Semiconductor Inc     (Current Assignee) Holtek Semiconductor Inc

Chin-Teng Lin, Hsin-An Lin
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) (codebook approach) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5953697A
CLAIM 4
. The method of claim 1 , wherein said shape index and quantized gain are obtained by a predetermined codebook approach (placing remaining impulse responses) of 16 different shape codewords with 4 bits .

US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .
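
Note (illustration only): claims 1 and 13 charted above construct the periodic excitation of a lost onset frame as a low-pass filtered train of pulses, with the first low-pass impulse response centered on the quantized first-glottal-pulse position and further copies spaced by the average pitch value. The sketch below is one hedged reading of that construction; the FIR low-pass design (scipy.signal.firwin, 31 taps, half-band cutoff) is an assumption and not the codec's actual filter.

```python
import numpy as np
from scipy.signal import firwin

def build_onset_excitation(frame_len, first_pulse_pos, avg_pitch, numtaps=31, cutoff=0.5):
    """Periodic excitation for a lost onset frame: low-pass impulse responses
    centered on the first glottal pulse position and repeated every average
    pitch period up to the end of the affected span."""
    h = firwin(numtaps, cutoff)              # generic low-pass impulse response (assumed design)
    frame_len = int(frame_len)
    exc = np.zeros(frame_len)
    step = max(1, int(round(avg_pitch)))     # distance between pulses = average pitch value
    pos = int(first_pulse_pos)
    while pos < frame_len:
        start = pos - numtaps // 2           # center the impulse response on the pulse position
        for i, tap in enumerate(h):
            j = start + i
            if 0 <= j < frame_len:
                exc[j] += tap
        pos += step
    return exc
```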

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (encoded parameter) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) (codebook approach) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5953697A
CLAIM 1
. A method for synthesizing speech based on encoded parameter (decoder constructs) s , comprising : (a) receiving pitch data , a set of filter coefficients , a shape index and a quantized gain that produces an envelope , and a voice/unvoiced parameter for a series of frames that are continuous in time ;
(b) selecting a periodic impulse train or white noise based on the voiced/unvoiced parameter ;
(c) providing the selected a periodic impulse train or white noise to a synthesis filter ;
(d) providing the filter coefficients to the synthesis filter ;
(e) determining a gain function based on the envelope and the output of the synthesis filter , the gain function calculated such that the maximum output of the synthesis filter excited by an input of the product of a unit impulse function and the gain approximates the envelope ;
and (f) multiplying the gain function and the output of the synthesis filter to produce a synthesized speech output .

US5953697A
CLAIM 4
. The method of claim 1 , wherein said shape index and quantized gain are obtained by a predetermined codebook approach (placing remaining impulse responses) of 16 different shape codewords with 4 bits .

US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (previous frames) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5953697A
CLAIM 3
. The method of claim 2 , wherein the interpolating LPC coefficients in a line spectrum pair (LSP) domain is achieved by dividing each speech frame into four subframes , and the LSP coefficient used in each subframe is obtained by linear interpolation of the LSP coefficients between the current and previous frames (frame concealment) , the interpolated LSP coefficients then being converted to LPC coefficients .

US5953697A
CLAIM 5
. The method of claim 1 , wherein said gain of voiced subframes is obtained by the steps of : (a) calculating an unit pulse response of said synthesis filter at the current pulse position ;
(b) calculating said gain of said current pulse by : ##EQU6## wherein α k is the k th pulse gain ;
Env k , i is the decoded envelope for the k th pulse at the position I ;
imp -- res k , i is the impulse response (impulse responses, impulse response, LP filter) ;
P O is the pulse position ;
and r is the search length (c) feeding said current pulse into said synthesis filter after said gain of said current pulse is obtained ;
(d) multiplying said current pulse and said α k to produce a synthesized speech output ;
and (e) repeating steps (a) through (d) for next pulse .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6023672A

Filed: 1997-04-16     Issued: 2000-02-08

Speech coder

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6023672A
CLAIM 5
. A speech coder , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal for every determined period of time and quantizing the spectral parameter ;
a mode judging unit (average pitch value) judging a mode by extracting a feature quantity from the speech signal ;
a divider dividing M non-zero amplitude pulses of an excitation signal into groups of fewer than M pulses ;
and an excitation quantizer calculating a plurality of sets of positions of the pulses in each group and , simultaneously , quantizing the amplitudes of the pulses in each group using a codebook and the spectral parameter , and selecting at least one quantization candidate by evaluating the distortion through addition of the evaluation value based on an adjacent group quantization candidate output value and the evaluation value based on the pertinent group quantization value , thereby selecting a combination of position set and a codevector for quantizing the speech signals .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal, speech coder) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal, speech coder) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal, speech coder) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6023672A
CLAIM 1
. A speech coder , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude (current frame) pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (zero amplitude) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6023672A
CLAIM 1
. A speech coder , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude (current frame) pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6023672A
CLAIM 5
. A speech coder , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal for every determined period of time and quantizing the spectral parameter ;
a mode judging unit (average pitch value) judging a mode by extracting a feature quantity from the speech signal ;
a divider dividing M non-zero amplitude pulses of an excitation signal into groups of fewer than M pulses ;
and an excitation quantizer calculating a plurality of sets of positions of the pulses in each group and , simultaneously , quantizing the amplitudes of the pulses in each group using a codebook and the spectral parameter , and selecting at least one quantization candidate by evaluating the distortion through addition of the evaluation value based on an adjacent group quantization candidate output value and the evaluation value based on the pertinent group quantization value , thereby selecting a combination of position set and a codevector for quantizing the speech signals .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal, speech coder) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .
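
A minimal sketch of the energy-information computation recited in the claim above, assuming the "maximum of a signal energy" is taken as the peak squared sample of the frame (the patent may compute it pitch-synchronously; the function name and that simplification are assumptions made here for illustration):

import numpy as np

def energy_information(frame, frame_class):
    # Voiced or onset frames: parameter tied to the maximum of the signal energy.
    # All other classes: average energy per sample.
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x * x))
    return float(np.mean(x * x))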

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal, speech coder) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .
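
The onset-specific gain limitation recited above amounts to clipping the scaling gain when the first good frame after an erasure is classified as onset. A minimal sketch, with a purely illustrative cap value and hypothetical names:

def limited_scaling_gain(g_computed, first_good_frame_class, onset_gain_cap=0.8):
    # Limit to a given value the gain used for scaling the synthesized signal
    # when the first non-erased frame received after the erasure is an onset.
    if first_good_frame_class == "onset":
        return min(g_computed, onset_gain_cap)
    return g_computed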

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal, speech coder) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .
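
The two transition cases recited above (a voiced-like frame before the erasure followed by an unvoiced frame, and comfort noise followed by active speech) both result in reusing the end-of-frame gain at the beginning of the first good frame. A minimal sketch with hypothetical names:

def begin_of_frame_gain(g_begin, g_end, last_good_class, first_good_class,
                        last_good_is_comfort_noise, first_good_is_active_speech):
    # g_begin: gain normally used at the beginning of the first good frame
    # g_end:   gain used at the end of that same frame
    voiced_like = ("voiced transition", "voiced", "onset")
    voiced_to_unvoiced = last_good_class in voiced_like and first_good_class == "unvoiced"
    cn_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return g_end if (voiced_to_unvoiced or cn_to_active) else g_begin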

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 · E LP0 / E LP1 where E 1 is an energy at an end of a current frame (zero amplitude) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6023672A
CLAIM 1
. A speech coder , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude (current frame) pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal, speech coder) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6023672A
CLAIM 1
. A speech coder (speech signal, decoder determines concealment) , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal (speech signal, decoder determines concealment) and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 where E 1 is an energy at an end of a current frame (zero amplitude) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6023672A
CLAIM 1
. A speech coder , comprising : a spectral parameter calculator obtaining a spectral parameter from an input speech signal and quantizing the spectral parameter ;
a divider dividing M non-zero amplitude (current frame) pulses of an excitation signal of the speech signal into groups , each of said groups of pulses having a number of pulses fewer than M ;
and an excitation quantizer calculating the positions of the pulses in each of said groups and simultaneously quantizing the amplitudes of the pulses using the spectral parameter , selecting and outputting at least one quantization candidate by evaluating distortion through addition of ;
(1) the evaluation value based on an adjacent group quantization candidate output value , and (2) the evaluation value based on the pertinent group quantization value , and encoding the speech signal using the selected quantization candidate .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH10282997A

Filed: 1997-04-04     Issued: 1998-10-23

Speech coding apparatus and speech decoding apparatus (音声符号化装置及び復号装置)

(Original Assignee) Nec Corp; 日本電気株式会社     

Toshiyuki Nomura, 俊之 野村
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。
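
The phase-information parameter recited in the challenged claim above bundles the quantized first-glottal-pulse position with an encoded shape, sign and amplitude. The bit-packing below is a hypothetical illustration only (the field widths and names are not taken from the patent or the reference):

def encode_first_glottal_pulse(position, shape_index, sign, amplitude,
                               pos_bits=6, shape_bits=2, amp_bits=4):
    # Pack position, shape index, sign and quantized amplitude into one code word.
    assert 0 <= position < (1 << pos_bits)
    assert 0 <= shape_index < (1 << shape_bits)
    assert 0 <= amplitude < (1 << amp_bits)
    sign_bit = 1 if sign < 0 else 0
    code = position
    code = (code << shape_bits) | shape_index
    code = (code << 1) | sign_bit
    code = (code << amp_bits) | amplitude
    return code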

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有すること (maximum amplitude) を特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。
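
Determining the first-glottal-pulse position as the sample of maximum amplitude within a pitch period, and then quantizing that position, can be sketched as below; searching the LP residual and using a uniform quantizer step are assumptions made for illustration only.

import numpy as np

def first_glottal_pulse_position(residual, pitch, quant_step=4):
    # Take the sample of maximum absolute amplitude within the first pitch period
    # as the first glottal pulse, then quantize its position with a uniform step.
    segment = np.asarray(residual[:pitch], dtype=float)
    position = int(np.argmax(np.abs(segment)))
    quantized_position = (position // quant_step) * quant_step
    return position, quantized_position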

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。
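
The energy-control step recited in the challenged claim above starts the first good frame at a gain matching the end of the last concealed frame and then converges toward the gain implied by the received energy information, while limiting any increase in energy. The linear gain ramp and the cap below are illustrative choices, not the patent's procedure:

import numpy as np

def control_synthesis_energy(synth_frame, g_begin, g_end, max_gain=2.0):
    # g_begin matches the energy at the end of the last erased (concealed) frame;
    # g_end is derived from the received energy information parameter.  The gain
    # is interpolated across the frame and capped to limit the energy increase.
    x = np.asarray(synth_frame, dtype=float)
    gains = np.minimum(np.linspace(g_begin, g_end, len(x)), max_gain)
    return gains * x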

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有すること (maximum amplitude) を特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有すること (maximum amplitude) を特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号, 音声復号) is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有すること (maximum amplitude) を特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号, 音声復号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号, 音声復号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号, 音声復号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 · E LP0 / E LP1 where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH10282997A
CLAIM 1
【請求項1】音声信号 (sound signal, speech signal) の励振信号を複数のパルスから成 るマルチパルス信号で表現し、前記励起信号により線形 予測合成フィルタを励振して得られる再生音声信号と入 力音声信号との間の歪みを最小化するように前記励起信 号を多段符号化する音声符号化装置であって、前記マル チパルス信号の多段符号化において、各段の符号化で、 その前段までに符号化したパルスの位置よりも、まだパ ルスが配置されていないパルス位置を優先したパルス位 置設定を行なうマルチパルス設定回路を有することを特 徴とする音声符号化装置。

JPH10282997A
CLAIM 2
【請求項2】多段符号化されたデータから複数のパルス で表現された励振信号を、符号化されたデータから線形 予測合成フィルタ係数を、復号し、前記励起信号により 前記線形予測合成フィルタを励振して再生音声信号を再 生する音声復号 (sound signal, speech signal) 装置であって、前記多段符号化されたデ ータから複数のパルスで表現された励振信号を復号する 際に、各段の復号で、その前段までに復号されたパルス の位置よりも、まだパルスが配置されていないパルス位 置を優先したパルス位置設定を行なうマルチパルス設定 回路を有することを特徴とする音声復号装置。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5907822A

Filed: 1997-04-04     Issued: 1999-05-25

Loss tolerant speech decoder for telecommunications

(Original Assignee) Lincom Corp     (Current Assignee) Engility LLC

Jaime L. Prieto, Jr.
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (energy characteristics) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (impulse response) up to the end of a last subframe affected by the artificial construction of the periodic part .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .

US5907822A
CLAIM 7
. A speech decoder as in claim 2 wherein at least one neural network is designated for the energy characteristics (first impulse) of said speech frame parameters .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .
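As a purely illustrative reading of the energy information parameter recited in claim 4 above (maximum of the signal energy for frames classified as voiced or onset, average energy per sample for other frames), one might sketch the computation as follows; representing the classification by plain string labels and operating on a raw sample buffer are assumptions of this sketch.

import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter: maximum of the signal energy for voiced
    or onset frames, average energy per sample otherwise (class labels as in
    the claim; framing is an assumption of this sketch)."""
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))   # maximum of the signal energy
    return float(np.mean(frame ** 2))      # average energy per sample

frame = 0.1 * np.random.randn(256)
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))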

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .
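The gain handling recited in claim 7 above can be summarised, under assumed string labels for the frame classes and coding modes, by the sketch below; it is not asserted to be the decoder logic of either document.

def scaling_gains(g_begin, g_end, prev_class, next_class, prev_coding, next_coding):
    """Make the gain used at the beginning of the first good frame equal to the
    gain used at its end (i) on a voiced-to-unvoiced transition and (ii) on a
    comfort-noise to active-speech transition; labels are assumed strings."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    if prev_class in voiced_like and next_class == "unvoiced":
        g_begin = g_end
    if prev_coding == "comfort noise" and next_coding == "active speech":
        g_begin = g_end
    return g_begin, g_end

# illustrative call with assumed gains and labels
print(scaling_gains(0.7, 1.1, "voiced", "unvoiced", "active speech", "active speech"))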

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .
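To make the claim 9 relation concrete, the sketch below evaluates E_q = E_1 × (E_LP0 / E_LP1) from assumed LP coefficient sets; the impulse-response truncation length, the filter convention A(z) = 1 + a_1 z^-1 + a_2 z^-2, and the coefficient values are hypothetical.

import numpy as np

def lp_impulse_response_energy(a_coeffs, n=64):
    """Energy of the impulse response of the all-pole synthesis filter 1/A(z),
    with A(z) = 1 + sum_k a_k z^-k, truncated to n samples (an assumption)."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k, a in enumerate(a_coeffs, start=1):
            if i - k >= 0:
                acc -= a * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, e_lp0, e_lp1):
    # E_q = E_1 * (E_LP0 / E_LP1)
    return e1 * e_lp0 / e_lp1

a_prev = [-1.2, 0.5]   # assumed LP coefficients of the last good frame before erasure
a_new = [-0.9, 0.2]    # assumed LP coefficients of the first good frame after erasure
E_LP0 = lp_impulse_response_energy(a_prev)
E_LP1 = lp_impulse_response_energy(a_new)
E_q = adjusted_excitation_energy(e1=1.0, e_lp0=E_LP0, e_lp1=E_LP1)
print(E_LP0, E_LP1, E_q)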

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (energy characteristics) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (impulse response) up to an end of a last subframe affected by the artificial construction of the periodic part .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .

US5907822A
CLAIM 7
. A speech decoder as in claim 2 wherein at least one neural network is designated for the energy characteristics (first impulse) of said speech frame parameters .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5907822A
CLAIM 15
. A speech decoder as in claim 1 wherein a speech compression algorithm synthesizer receives decoded parameters from said parameter decoder and transforms said decoded parameters into speech signal (speech signal, decoder determines concealment) voltages that are then output to a listener .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5907822A
CLAIM 4
. A speech decoder as in claim 3 wherein said neural networks are finite-impulse response (impulse responses, impulse response, LP filter) multi-layer feed-forward neural networks .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6122607A

Filed: 1997-03-25     Issued: 2000-09-19

Method and arrangement for reconstruction of a received speech signal

(Original Assignee) Telefonaktiebolaget LM Ericsson AB     (Current Assignee) Telefonaktiebolaget LM Ericsson AB

Erik Ekudden, Daniel Brighenti
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (inverse filter) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 28
. An arrangement according to claim 27 , wherein the first digital filter is an inverse filter (impulse response) ;
and the second digital filter is a synthesis filter .
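For reference when reading the mappings that follow, the weighted combination recited in US6122607A claim 1 (received signal and model-estimated signal combined with weighting factors set by a quality parameter) could be sketched as below; treating q as a scalar in [0, 1] and using alpha = q, beta = 1 - q are assumptions of this sketch, not the reference's stated rule.

import numpy as np

def reconstruct(received, estimated, q):
    """Weighted combination of the received signal r and the model estimate:
    the quality parameter q (assumed in [0, 1], 1 = good) sets the weighting
    factors alpha and beta used to form the reconstructed signal."""
    alpha = q
    beta = 1.0 - q
    return alpha * np.asarray(received, dtype=float) + beta * np.asarray(estimated, dtype=float)

# illustrative call with assumed short buffers
r_rec = reconstruct([0.2, 0.1, -0.1], [0.15, 0.05, -0.05], q=0.8)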

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 30
. An arrangement according to claim 29 , wherein said signal modelling unit includes an excitation generating unit that functions to generate an estimated signal that is based on three of said linear predictive signal mode parameters and a second summation signal , and includes a state machine that functions to generate control signal (maximum amplitude) s that are based on said quality parameter and on one of said linear predictive signal mode parameters .
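The phase-information determination recited in claim 3 above (taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position) can be illustrated by the sketch below; the bit budget and the uniform quantizer are assumptions.

import numpy as np

def quantize_first_glottal_pulse(excitation, pitch_period, n_bits=6):
    """Locate the sample of maximum amplitude within the first pitch period
    (taken as the first glottal pulse) and quantize its position with an
    assumed uniform quantizer of n_bits."""
    segment = np.asarray(excitation[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))        # sample of maximum amplitude
    levels = 2 ** n_bits
    step = max(1, int(np.ceil(pitch_period / levels)))
    q_index = pos // step                        # transmitted quantization index
    q_pos = q_index * step                       # position reconstructed at the decoder
    return q_index, q_pos

# illustrative call with an assumed excitation-like signal
exc = np.sin(2 * np.pi * np.arange(256) / 64.0)
print(quantize_first_glottal_pulse(exc, pitch_period=64))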

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (predetermined quality, speech signal, bad frame) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (predetermined quality, speech signal, bad frame) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (first sum) per sample for other frames .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 22
. An arrangement according to claim 21 , wherein the signal combining unit functions to form a first weighted value of said received signal by multiplying said received signal with said first weighting factor in a first multiplier unit , and to form a second weighted value of said estimated signal by multiplying said estimated signal with said second weighting factor in a second multiplier unit , wherein the first and the second weighted values according to said ratio , are combined in a first sum (average energy) mation , and wherein said reconstructed signal is formed as a first summation signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .
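The energy control recited in claim 5 above (scale the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last erased frame, then converge toward the energy given by the received energy information parameter while limiting any increase) might be sketched as below; the per-sample linear gain ramp and the cap value are assumptions of this sketch.

import numpy as np

def energy_control(synth, e_prev_end, e_target, max_up=1.5):
    """Scale the synthesized frame with a gain that starts at a value matching
    the energy at the end of the last erased frame and ends at a value that
    converges toward the received energy information, with the upward step
    capped by an assumed factor max_up."""
    synth = np.asarray(synth, dtype=float)
    e_synth = float(np.mean(synth ** 2)) + 1e-12
    g0 = np.sqrt(e_prev_end / e_synth)    # match energy at the frame beginning
    g1 = np.sqrt(e_target / e_synth)      # converge toward the received energy
    g1 = min(g1, g0 * max_up)             # limit the increase in energy
    gains = np.linspace(g0, g1, len(synth))
    return synth * gains

# illustrative call with assumed energies
scaled = energy_control(0.1 * np.random.randn(256), e_prev_end=0.01, e_target=0.02)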

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (predetermined quality, speech signal, bad frame) is a speech signal (predetermined quality, speech signal, bad frame) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (predetermined quality, speech signal, bad frame) is a speech signal (predetermined quality, speech signal, bad frame) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number (last non) of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 35
. An arrangement according to claim 34 , wherein said memory buffer functions to generate , on the basis of two of said linear predictive signal model parameters , a first signal that represents a (LP filter excitation signal) voice speech sound .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (inverse filter) of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number (last non) of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality value .

US6122607A
CLAIM 28
. An arrangement according to claim 27 , wherein the first digital filter is an inverse filter (impulse response) ;
and the second digital filter is a synthesis filter .

US6122607A
CLAIM 35
. An arrangement according to claim 34 , wherein said memory buffer functions to generate , on the basis of two of said linear predictive signal model parameters , a first signal that represents a (LP filter excitation signal) voice speech sound .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 30
. An arrangement according to claim 29 , wherein said signal modelling unit includes an excitation generating unit that functions to generate an estimated signal that is based on three of said linear predictive signal mode parameters and a second summation signal , and includes a state machine that functions to generate control signal (maximum amplitude) s that are based on said quality parameter and on one of said linear predictive signal mode parameters .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (predetermined quality, speech signal, bad frame) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (inverse filter) of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number (last non) of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 28
. An arrangement according to claim 27 , wherein the first digital filter is an inverse filter (impulse response) ;
and the second digital filter is a synthesis filter .

US6122607A
CLAIM 35
. An arrangement according to claim 34 , wherein said memory buffer functions to generate , on the basis of two of said linear predictive signal model parameters , a first signal that represents a (LP filter excitation signal) voice speech sound .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (inverse filter) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
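
Annotation: the artificial periodic excitation described in claim 13 above (a low-pass filtered train of pulses separated by the pitch period, anchored on the quantized first glottal pulse position) can be sketched as follows. The function name, the use of numpy.convolve and the assumption of a symmetric FIR low-pass response are illustrative only.

    import numpy as np

    def build_periodic_excitation(frame_len, first_pulse_pos, pitch, lp_h):
        # Place a unit pulse at the quantized first-glottal-pulse position and
        # every 'pitch' samples after it, then realize the low-pass filtering
        # by centring the filter impulse response 'lp_h' on each pulse
        # (implemented here as a convolution).
        pulses = np.zeros(frame_len)
        pos = int(first_pulse_pos)
        while pos < frame_len:
            pulses[pos] = 1.0
            pos += int(round(pitch))          # spacing = average pitch value
        excitation = np.convolve(pulses, lp_h)
        delay = len(lp_h) // 2                # centre of a symmetric FIR response
        return excitation[delay:delay + frame_len]

Centering copies of the impulse response on each pulse, as the claim recites, is equivalent to this convolution followed by compensation of the filter delay.
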
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 28
. An arrangement according to claim 27 , wherein the first digital filter is an inverse filter (impulse response) ;
and the second digital filter is a synthesis filter .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
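
Annotation: a minimal sketch of the shape, sign and amplitude encoding of the first glottal pulse recited in claim 14 above, assuming a small normalized shape codebook and an 8-sample analysis window; both assumptions, like the function name encode_first_glottal_pulse, are illustrative and not taken from the patent.

    import numpy as np

    def encode_first_glottal_pulse(residual, pos, shape_codebook):
        # 'pos' is the located first glottal pulse (assumed to leave at least
        # 8 samples before the end of 'residual'); encode its sign, amplitude
        # and the index of the best-matching normalized shape.
        win = np.asarray(residual[pos:pos + 8], dtype=float)
        sign = 1 if residual[pos] >= 0 else -1
        amplitude = float(abs(residual[pos]))
        shape = win / (np.linalg.norm(win) + 1e-12)
        shape_idx = int(np.argmax([float(np.dot(shape, c)) for c in shape_codebook]))
        return sign, amplitude, shape_idx
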
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
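
Annotation: the position search and quantization recited in claim 15 above reduces to locating the sample of maximum amplitude inside one pitch period and quantizing that position. In the sketch below, the 4-sample grid resolution is an illustrative assumption.

    import numpy as np

    def first_glottal_pulse_position(residual, pitch, grid=4):
        # Sample of maximum amplitude within one pitch period, taken as the
        # first glottal pulse, followed by a coarse position quantizer.
        segment = np.abs(np.asarray(residual[:int(pitch)], dtype=float))
        pos = int(np.argmax(segment))
        quantized_pos = int(grid * round(pos / grid))
        return pos, min(quantized_pos, int(pitch) - 1)
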
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 30
. An arrangement according to claim 29 , wherein said signal modelling unit includes an excitation generating unit that functions to generate an estimated signal that is based on three of said linear predictive signal mode parameters and a second summation signal , and includes a state machine that functions to generate control signal (maximum amplitude) s that are based on said quality parameter and on one of said linear predictive signal mode parameters .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (predetermined quality, speech signal, bad frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (predetermined quality, speech signal, bad frame) for frames classified as voiced or onset , and in relation to an average energy (first sum) per sample for other frames .
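
Annotation: the two-branch energy computation of claim 16 above (maximum of the signal energy for voiced or onset frames, average energy per sample otherwise) is sketched below; using the largest squared sample as the "maximum of a signal energy" is a simplifying assumption.

    import numpy as np

    def energy_information_parameter(frame, frame_class):
        # Voiced/onset frames: maximum of the signal energy (approximated here
        # by the largest squared sample); other classes: average energy per sample.
        frame = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset"):
            return float(np.max(frame ** 2))
        return float(np.mean(frame ** 2))
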
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 22
. An arrangement according to claim 21 , wherein the signal combining unit functions to form a first weighted value of said received signal by multiplying said received signal with said first weighting factor in a first multiplier unit , and to form a second weighted value of said estimated signal by multiplying said estimated signal with said second weighting factor in a second multiplier unit , wherein the first and the second weighted values according to said ratio , are combined in a first sum (average energy) mation , and wherein said reconstructed signal is formed as a first summation signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
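
Annotation: claim 17 above describes two gain operations on the first good frame: match its starting energy to the energy at the end of the last erased frame, then converge toward the energy given by the received energy information parameter while limiting any increase. A minimal sketch, with the quarter-frame measurement windows and the gain cap as illustrative assumptions:

    import numpy as np

    def rescale_first_good_frame(synth, E_last_erased, E_target, max_gain=2.0):
        # g0 aligns the energy at the frame beginning with the last erased
        # frame; g1 converges toward the received energy parameter at the
        # frame end; both gains are capped to limit an energy increase.
        synth = np.asarray(synth, dtype=float)
        q = max(len(synth) // 4, 1)
        E_begin = float(np.mean(synth[:q] ** 2)) + 1e-12
        E_end = float(np.mean(synth[-q:] ** 2)) + 1e-12
        g0 = min(np.sqrt(E_last_erased / E_begin), max_gain)
        g1 = min(np.sqrt(E_target / E_end), max_gain)
        gains = np.linspace(g0, g1, num=len(synth))   # sample-wise interpolation
        return synth * gains
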
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (predetermined quality, speech signal, bad frame) is a speech signal (predetermined quality, speech signal, bad frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (predetermined quality, speech signal, bad frame) is a speech signal (predetermined quality, speech signal, bad frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
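
Annotation: the two conditions of claim 19 above under which the scaling gain at the beginning of the first good frame is simply set equal to the gain at its end can be expressed as a small predicate; the parameter names are illustrative.

    def use_end_gain_at_frame_start(last_good_class, first_good_class,
                                    last_good_is_cng, first_good_is_active):
        # True on a voiced-to-unvoiced transition, or when leaving a
        # comfort-noise (inactive) period for active speech.
        voiced_to_unvoiced = (
            last_good_class in ("voiced transition", "voiced", "onset")
            and first_good_class == "unvoiced"
        )
        cng_to_active = last_good_is_cng and first_good_is_active
        return voiced_to_unvoiced or cng_to_active
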
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number (last non) of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 35
. An arrangement according to claim 34 , wherein said memory buffer functions to generate , on the basis of two of said linear predictive signal model parameters , a first signal that represents a (LP filter excitation signal) voice speech sound .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (inverse filter) of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number (last non) of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality value .

US6122607A
CLAIM 28
. An arrangement according to claim 27 , wherein the first digital filter is an inverse filter (impulse response) ;
and the second digital filter is a synthesis filter .

US6122607A
CLAIM 35
. An arrangement according to claim 34 , wherein said memory buffer functions to generate , on the basis of two of said linear predictive signal model parameters , a first signal that represents a (LP filter excitation signal) voice speech sound .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 30
. An arrangement according to claim 29 , wherein said signal modelling unit includes an excitation generating unit that functions to generate an estimated signal that is based on three of said linear predictive signal mode parameters and a second summation signal , and includes a state machine that functions to generate control signal (maximum amplitude) s that are based on said quality parameter and on one of said linear predictive signal mode parameters .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (predetermined quality, speech signal, bad frame) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (predetermined quality, speech signal, bad frame) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (predetermined quality, speech signal, bad frame) for frames classified as voiced or onset , and in relation to an average energy (first sum) per sample for other frames .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 22
. An arrangement according to claim 21 , wherein the signal combining unit functions to form a first weighted value of said received signal by multiplying said received signal with said first weighting factor in a first multiplier unit , and to form a second weighted value of said estimated signal by multiplying said estimated signal with said second weighting factor in a second multiplier unit , wherein the first and the second weighted values according to said ratio , are combined in a first sum (average energy) mation , and wherein said reconstructed signal is formed as a first summation signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (predetermined quality, speech signal, bad frame) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (predetermined quality, speech signal, bad frame) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (error rate) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (inverse filter) of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6122607A
CLAIM 1
. A method of reconstructing a speech signal (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) from a received signal (r) , characterized by creating through a signal model (500) an estimated signal (p) that corresponds to anticipated future values of the received signal (r) ;
generating a quality parameter (q) based on quality characteristics of said received signal (r) ;
combining said received signal (r) and said estimated signal (ρ) and forming a reconstructed speech signal (r rec) , wherein said quality parameter (q) determines weighting factors (α , β) based upon which said respective received signal (r) and said estimated signal (ρ) are combined .

US6122607A
CLAIM 4
. A method according to claim 1 , wherein said quality parameter is based on a bit error rate (frame concealment) that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 5
. A method according to claim 1 , wherein said quality parameter is based on a bad frame (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) indicator that has been calculated from a digital representation of said received signal .

US6122607A
CLAIM 17
. A method according to claim 1 , wherein a transition from solely said received signal to solely said estimated signal takes place during a transition period of at least a predetermined number (last non) of consecutive samples of said received signal during which the quality parameter for said received signal is below a predetermined quality (sound signal, signal classification parameter, speech signal, signal energy, decoder determines concealment, decoder concealment, determining concealment) value .

US6122607A
CLAIM 28
. An arrangement according to claim 27 , wherein the first digital filter is an inverse filter (impulse response) ;
and the second digital filter is a synthesis filter .

US6122607A
CLAIM 35
. An arrangement according to claim 34 , wherein said memory buffer functions to generate , on the basis of two of said linear predictive signal model parameters , a first signal that represents a (LP filter excitation signal) voice speech sound .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH10260698A

Filed: 1997-03-21     Issued: 1998-09-29

Signal Encoding Device (信号符号化装置)

(Original Assignee) Nec Corp; 日本電気株式会社     

Kazunori Ozawa, 一範 小澤
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
[Claim 2] The signal encoding device according to claim 1 , wherein the pulse quantization means has a first search section that searches for a first pulse group while repeating the plurality of pulses on the basis of the pitch parameter , and a second search section that searches for a second pulse group on the basis of the second transform signal , the device further comprising a selection circuit that selects , from among the first pulse group and the second pulse group , the one that optimizes (energy information parameter) the first transform signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
[Claim 2] The signal encoding device according to claim 1 , wherein the pulse quantization means has a first search section that searches for a first pulse group while repeating the plurality of pulses on the basis of the pitch parameter , and a second search section that searches for a second pulse group on the basis of the second transform signal , the device further comprising a selection circuit that selects , from among the first pulse group and the second pulse group , the one that optimizes (energy information parameter) the first transform signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
[Claim 2] The signal encoding device according to claim 1 , wherein the pulse quantization means has a first search section that searches for a first pulse group while repeating the plurality of pulses on the basis of the pitch parameter , and a second search section that searches for a second pulse group on the basis of the second transform signal , the device further comprising a selection circuit that selects , from among the first pulse group and the second pulse group , the one that optimizes (energy information parameter) the first transform signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
[Claim 2] The signal encoding device according to claim 1 , wherein the pulse quantization means has a first search section that searches for a first pulse group while repeating the plurality of pulses on the basis of the pitch parameter , and a second search section that searches for a second pulse group on the basis of the second transform signal , the device further comprising a selection circuit that selects , from among the first pulse group and the second pulse group , the one that optimizes (energy information parameter) the first transform signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
[Claim 2] The signal encoding device according to claim 1 , wherein the pulse quantization means has a first search section that searches for a first pulse group while repeating the plurality of pulses on the basis of the pitch parameter , and a second search section that searches for a second pulse group on the basis of the second transform signal , the device further comprising a selection circuit that selects , from among the first pulse group and the second pulse group , the one that optimizes (energy information parameter) the first transform signal .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
[Claim 2] The signal encoding device according to claim 1 , wherein the pulse quantization means has a first search section that searches for a first pulse group while repeating the plurality of pulses on the basis of the pitch parameter , and a second search section that searches for a second pulse group on the basis of the second transform signal , the device further comprising a selection circuit that selects , from among the first pulse group and the second pulse group , the one that optimizes (energy information parameter) the first transform signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10260698A
CLAIM 1
[Claim 1] A signal encoding device for encoding a speech signal (sound signal, speech signal) , comprising : parameter calculation means for obtaining spectral parameters and pitch parameters from the speech signal and quantizing them ; impulse response calculation means for calculating the impulse response of a filter constituted by at least one of the quantized spectral parameters or pitch parameters ; first orthogonal transform means for obtaining a first transform signal by orthogonally transforming , on the basis of the quantized spectral parameters and pitch parameters , the speech signal or a signal derived from the speech signal ; second orthogonal transform means for obtaining a second transform signal by orthogonally transforming the calculated impulse response or a signal derived from the impulse response ; and pulse quantization means for obtaining a plurality of pulses by quantizing part or all of the first transform signal , and the second transform signal .

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。
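
Claim 11 above ties the phase information to the sample of maximum amplitude within a pitch period and to quantizing its position. A minimal sketch, assuming a uniform position quantizer with a hypothetical bit budget:

```python
import numpy as np

def quantize_first_pulse_position(excitation, pitch_period, bits=6):
    """Take the sample of maximum amplitude within the first pitch period as the first
    glottal pulse and uniformly quantize its position (the bit budget is an assumption)."""
    segment = np.asarray(excitation[:pitch_period], dtype=float)
    position = int(np.argmax(np.abs(segment)))

    step = max(1, int(np.ceil(pitch_period / float(2 ** bits))))
    index = position // step                       # value that would be transmitted
    decoded_position = min(index * step + step // 2, pitch_period - 1)
    return index, decoded_position
```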

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。
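
Claim 12 above applies the relation E_q = E_1 · (E_LP0 / E_LP1) when the LP filter gain has grown across the erasure. The sketch below computes the impulse-response energies directly from LP coefficients (assumed normalized so the leading coefficient is 1) and applies the scaling; the 64-sample response length and the use of the last good frame's filter as a stand-in for the filter used during the erasure are assumptions.

```python
import numpy as np

def lp_impulse_response_energy(lp_coeffs, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z),
    with A(z) = 1 + a1*z^-1 + ... + aM*z^-M (lp_coeffs = [1, a1, ..., aM])."""
    h = np.zeros(length)
    for n in range(length):
        h[n] = (1.0 if n == 0 else 0.0) - sum(
            lp_coeffs[k] * h[n - k] for k in range(1, len(lp_coeffs)) if n >= k)
    return float(np.dot(h, h))

def adjusted_excitation_energy(e1, lp_before_erasure, lp_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1): scale the excitation energy down when the LP filter
    of the first good frame has more gain than the one in use before/during the erasure."""
    e_lp0 = lp_impulse_response_energy(lp_before_erasure)  # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(lp_first_good)      # first good frame after the erasure
    return e1 * e_lp0 / e_lp1 if e_lp1 > e_lp0 else e1
```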

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。
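
Claim 13 above constructs the periodic excitation for a lost onset as a low-pass filtered train of pulses: the first impulse response of the low-pass filter is centered on the quantized first-glottal-pulse position and the remaining responses are spaced by the average pitch. A minimal sketch, assuming a short 5-tap low-pass response not taken from the patent:

```python
import numpy as np

# Hypothetical 5-tap low-pass impulse response; illustrative only.
LOWPASS_H = np.array([0.25, 0.5, 1.0, 0.5, 0.25]) / 2.5

def build_periodic_excitation(region_length, first_pulse_pos, avg_pitch):
    """Low-pass filtered periodic pulse train over the concealed region: first response
    centered on the quantized first-glottal-pulse position, the rest every avg_pitch."""
    avg_pitch = max(1, int(round(avg_pitch)))          # guard against a degenerate pitch
    excitation = np.zeros(region_length)
    half = len(LOWPASS_H) // 2
    pos = int(first_pulse_pos)
    while pos < region_length:
        start, stop = pos - half, pos + half + 1
        lo = max(0, -start)
        hi = len(LOWPASS_H) - max(0, stop - region_length)
        excitation[max(0, start):min(stop, region_length)] += LOWPASS_H[lo:hi]
        pos += avg_pitch
    return excitation
```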

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。
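
Claim 16 above computes the energy information parameter from the maximum of the signal energy for voiced or onset frames and from the average energy per sample otherwise. A minimal sketch, with the dB-domain representation added as an assumption:

```python
import numpy as np

def energy_information_parameter(frame, frame_class):
    """Maximum of the signal energy for voiced/onset frames, average energy per sample
    for the other classes; the dB conversion is an illustrative assumption."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energy = float(np.max(x ** 2))      # maximum of the per-sample signal energy
    else:
        energy = float(np.mean(x ** 2))     # average energy per sample
    return 10.0 * np.log10(energy + 1e-12)
```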

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。
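
Claim 17 above scales the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the erasure, then converges toward the transmitted energy parameter by the frame end while limiting any increase. A minimal sketch, assuming linear gain interpolation over the frame and an arbitrary gain cap:

```python
import numpy as np

def rescale_recovered_frame(synth, energy_end_of_concealment, energy_received, max_gain=2.0):
    """Match the concealed-signal energy at the frame start, converge to the received
    energy parameter at the frame end, and cap the gain (cap and interpolation assumed)."""
    synth = np.asarray(synth, dtype=float)
    quarter = max(1, len(synth) // 4)
    e_start = float(np.mean(synth[:quarter] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-quarter:] ** 2)) + 1e-12

    g_start = min(np.sqrt(energy_end_of_concealment / e_start), max_gain)
    g_end = min(np.sqrt(energy_received / e_end), max_gain)   # limited increase in energy
    gains = np.linspace(g_start, g_end, len(synth))           # sample-wise interpolation
    return synth * gains
```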

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (最適化) and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10260698A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を符号化する信号符号化装置において、音声信号からスペクトルパラメータおよびピッチパラメータを求めて量子化するパラメータ計算手段と、これら量子化されたスペクトルパラメータまたはピッチパラメータのうち、少なくとも一つによって構成されるフィルタにより、そのインパルス応答を算出するインパルス応答計算手段と、量子化されたスペクトルパラメータおよびピッチパラメータに基づいて、音声信号または音声信号に由来する信号の直交変換をして第1変換信号を求める第1直交変換手段と、算出されたインパルス応答またはインパルス応答に由来する信号の直交変換をして第2変換信号を求める第2直交変換手段と、第1変換信号の一部分または全部、および第2変換信号を量子化することによって複数個のパルスを求めるパルス量子化手段とを備えることを特徴とする信号符号化装置。

JPH10260698A
CLAIM 2
【請求項2】 前記パルス量子化手段には、前記複数個のパルスをピッチパラメータに基づいて繰り返しながら第1パルス群を探索する第1探索部と、第2変換信号に基づいて第2パルス群を探索する第2探索部とを有しており、第1パルス群および第2パルス群のうちから第1変換信号を最適化 (energy information parameter) するものを選択する選択回路を更に備える請求項1記載の信号符号化装置。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6170073B1

Filed: 1997-03-21     Issued: 2001-01-02

Method and apparatus for error detection in digital communications

(Original Assignee) Nokia Mobile Phones UK Ltd     (Current Assignee) Nokia Oyj ; Intellectual Ventures I LLC

Kari Jarvinen, Janne Vainio, Petri Haavisto, Tero Honkanen
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6170073B1
CLAIM 4
. An encoder according to claim 3 , wherein the first coding means is a speech encoder (last frame, replacement frame) and the second coding means is a channel encoder .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6170073B1
CLAIM 4
. An encoder according to claim 3 , wherein the first coding means is a speech encoder (last frame, replacement frame) and the second coding means is a channel encoder .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6170073B1
CLAIM 4
. An encoder according to claim 3 , wherein the first coding means is a speech encoder (last frame, replacement frame) and the second coding means is a channel encoder .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6170073B1
CLAIM 4
. An encoder according to claim 3 , wherein the first coding means is a speech encoder (last frame, replacement frame) and the second coding means is a channel encoder .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6170073B1
CLAIM 4
. An encoder according to claim 3 , wherein the first coding means is a speech encoder (last frame, replacement frame) and the second coding means is a channel encoder .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6170073B1
CLAIM 4
. An encoder according to claim 3 , wherein the first coding means is a speech encoder (last frame, replacement frame) and the second coding means is a channel encoder .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6292834B1

Filed: 1997-03-14     Issued: 2001-09-18

Dynamic bandwidth selection for efficient transmission of multimedia streams in a computer network

(Original Assignee) Microsoft Corp     (Current Assignee) Microsoft Technology Licensing LLC

Hemanth Srinivas Ravi, Anders Edgar Klemets, Navin Chaddha, David de Val
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (first time) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time (onset frame) of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .
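
The cited US6292834B1 claim describes a rate-control test on a client playout buffer: if the time span covered by the buffered packets falls below a dynamically computed DEC BW threshold, the transmission rate is decremented. A minimal sketch follows; the timestamp field, the fixed threshold stand-in and the step size are assumptions, not taken from the reference.

```python
def maybe_decrement_rate(playout_buffer, rate_kbps, dec_bw_threshold_ms=200.0, step_kbps=8):
    """Decrement the transmission rate when the span between the first and last buffered
    packet times drops below the DEC BW threshold (threshold computation simplified)."""
    if not playout_buffer:
        return rate_kbps
    span_ms = playout_buffer[-1]["timestamp_ms"] - playout_buffer[0]["timestamp_ms"]
    if span_ms < dec_bw_threshold_ms:
        rate_kbps = max(step_kbps, rate_kbps - step_kbps)
    return rate_kbps
```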

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (incoming data) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (incoming data) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (incoming data) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (first time) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time (onset frame) of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (incoming data) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
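
Illustrative sketch (not part of the chart): one way to picture the searcher of claim 14, which encodes a shape, sign and amplitude for the first glottal pulse; the prototype-shape codebook, the toy amplitude quantizer and all names below are hypothetical placeholders, not tables from the patent.

    PULSE_SHAPES = [                 # hypothetical 3-sample pulse prototypes
        [0.2, 1.0, 0.2],
        [0.5, 1.0, 0.5],
        [0.0, 1.0, 0.0],
    ]

    def encode_first_glottal_pulse(residual, pitch_period):
        window = residual[:pitch_period]
        pos = max(range(len(window)), key=lambda n: abs(window[n]))
        amp = abs(window[pos])
        sign = 1 if window[pos] >= 0 else -1
        segment = residual[max(pos - 1, 0):pos + 2]
        def err(shape):              # squared error against a scaled prototype
            return sum((a - sign * amp * b) ** 2 for a, b in zip(segment, shape))
        shape_idx = min(range(len(PULSE_SHAPES)), key=lambda i: err(PULSE_SHAPES[i]))
        amp_q = round(amp * 8) / 8.0                 # toy amplitude quantizer
        return {"position": pos, "sign": sign, "amplitude": amp_q, "shape": shape_idx}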
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (incoming data) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
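
Illustrative sketch (not part of the chart): claim 15 measures the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizes its position; the uniform quantization step below is an assumed value for illustration only.

    def quantize_glottal_pulse_position(residual, pitch_period, step=4):
        window = residual[:pitch_period]
        pos = max(range(len(window)), key=lambda n: abs(window[n]))  # max-amplitude sample
        pos_q = (pos // step) * step + step // 2                     # uniform position quantizer
        return pos, min(pos_q, pitch_period - 1)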
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (incoming data) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
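
Illustrative sketch (not part of the chart): claim 16's energy information parameter uses the maximum of the signal energy for frames classified as voiced or onset and the average energy per sample otherwise; the class labels follow the claim, while the dB conversion and the names below are illustrative assumptions.

    import math

    def energy_information(frame, frame_class):
        if frame_class in ("voiced", "onset"):
            e = max(s * s for s in frame)                  # maximum of the signal energy
        else:
            e = sum(s * s for s in frame) / len(frame)     # average energy per sample
        return 10.0 * math.log10(e + 1e-12)                # report in dB; floor avoids log(0)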
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (incoming data) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
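
Illustrative sketch (not part of the chart): the two-sided energy control of claim 17 can be approximated by a per-sample gain ramp that starts from the energy at the end of the last concealed frame and converges toward the received energy information while capping any increase; the square-root gains, the subframe length and the max_up cap are assumptions for this sketch, not the patent's procedure.

    def rescale_first_good_frame(synth, e_concealed_end, e_target, subframe=64, max_up=1.2):
        # synth: decoded samples of the first non-erased frame after an erasure
        def energy(x):
            return sum(s * s for s in x) / len(x)
        g0 = (e_concealed_end / (energy(synth[:subframe]) + 1e-12)) ** 0.5  # match start energy
        g1 = (e_target / (energy(synth[-subframe:]) + 1e-12)) ** 0.5        # converge to target
        g1 = min(g1, max_up * g0)                                           # limit energy increase
        n = len(synth)
        return [s * (g0 + (g1 - g0) * i / (n - 1)) for i, s in enumerate(synth)]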
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (incoming data) , limits to a given value a gain used for scaling the synthesized sound signal .
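
Illustrative sketch (not part of the chart): claim 18 limits the scaling gain to a given value when the first good frame is classified as onset; the cap value below is purely illustrative.

    ONSET_GAIN_CAP = 0.8      # assumed value, not taken from the patent

    def limit_scaling_gain(gain, frame_class):
        return min(gain, ONSET_GAIN_CAP) if frame_class == "onset" else gain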
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (incoming data) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (incoming data) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
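
Illustrative sketch (not part of the chart): the relation E_q = E_1 × (E_LP0 / E_LP1) of claim 25 can be exercised with the helper below, which estimates each E_LP as the energy of a truncated impulse response of the synthesis filter 1/A(z); the truncation length, coefficient convention and function names are assumptions made for this sketch only.

    def lp_impulse_response_energy(lp_coeffs, length=64):
        # Energy of the (truncated) impulse response of the all-pole filter 1/A(z),
        # with y[n] = x[n] - sum_k a_k * y[n-k].
        h = []
        for n in range(length):
            x = 1.0 if n == 0 else 0.0
            y = x - sum(a * h[n - k] for k, a in enumerate(lp_coeffs, start=1) if n - k >= 0)
            h.append(y)
        return sum(v * v for v in h)

    def adjusted_excitation_energy(e1, lp_last_concealed, lp_first_good):
        e_lp0 = lp_impulse_response_energy(lp_last_concealed)
        e_lp1 = lp_impulse_response_energy(lp_first_good)
        if e_lp1 > e_lp0:                  # first good frame has the higher LP gain
            return e1 * e_lp0 / e_lp1      # E_q = E_1 * (E_LP0 / E_LP1)
        return e1

    # Toy usage: single-pole filters, the second having the higher gain.
    e_q = adjusted_excitation_energy(1.0e6, lp_last_concealed=[-0.9], lp_first_good=[-0.95])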
US6292834B1
CLAIM 1
. In a client computer , a method of dynamically adjusting a data transmission rate of a data stream from a server to the client computer , the client computer coupled to the server via a computer network , the computer network providing a network connection with a variable bandwidth for transmitting the data stream , the data stream including a plurality of data packets , said client computer including a playout buffer for buffering the incoming data (decoder recovery, decoder determines concealment) packets , the method comprising : initializing the data transmission rate for said data stream ;
dynamically computing a decrement bandwidth (DEC BW) threshold for the playout buffer ;
and decrementing the transmission rate upon determining that a difference between a first time of a first data packet and a last time of a last data packet in said playout buffer drops below said dynamically computed DEC BW threshold .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH10232699A

Filed: 1997-02-21     Issued: 1998-09-02

LPC vocoder (Lpcボコーダ)

(Original Assignee) Japan Radio Co Ltd; 日本無線株式会社     

Akihiro Nakahara, 聡宏 中原
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (フレーム間) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

JPH10232699A
CLAIM 4
【請求項4】 音声信号を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成する際、前記 ピッチを示す特徴パラメータをフレーム間 (onset frame) で滑らかに変 化するよう補間し、これに併せて合成する各フレームデ ータのフレーム長を調整して該音声信号を復元するLP Cボコーダにおいて、 前記特徴パラメータの抽出時に前記パワーの抽出にあた って、合成時に調整されるフレーム長に対応して抽出対 象フレームデータ長を調節する手段を備えたことを特徴 とするLPCボコーダ。
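
Illustrative sketch (not part of the chart): JPH10232699A claim 1 adds a secondary pulse to the excitation generated when a voiced frame is synthesized; the amplitudes and the mid-period offset below are placeholders, not values from the reference.

    def voiced_excitation_with_secondary_pulse(frame_len, pitch, main_amp=1.0,
                                               sec_amp=0.3, sec_offset=None):
        if sec_offset is None:
            sec_offset = pitch // 2                 # place the secondary pulse mid-period
        exc = [0.0] * frame_len
        for start in range(0, frame_len, pitch):
            exc[start] += main_amp                  # main pitch pulse
            if start + sec_offset < frame_len:
                exc[start + sec_offset] += sec_amp  # secondary pulse per claim 1
        return exc

    exc = voiced_excitation_with_secondary_pulse(frame_len=160, pitch=40)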

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (特徴パラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (特徴パラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (特徴パラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (特徴パラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
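
Illustrative sketch (not part of the chart): claim 7 makes the scaling gain at the beginning of the first good frame equal to the gain used at its end in two transition cases; the function and class/coding labels below follow the claim wording but are otherwise assumed.

    def initial_scaling_gain(g_begin, g_end, last_good_class, first_good_class,
                             last_good_coding, first_good_coding):
        voiced_like = ("voiced transition", "voiced", "onset")
        voiced_to_unvoiced = (last_good_class in voiced_like
                              and first_good_class == "unvoiced")
        cn_to_active = (last_good_coding == "comfort noise"
                        and first_good_coding == "active speech")
        if voiced_to_unvoiced or cn_to_active:
            return g_end         # no ramp: begin-of-frame gain equals end-of-frame gain
        return g_begin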
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (特徴パラメータ) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ; wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (フレーム間) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

JPH10232699A
CLAIM 4
【請求項4】 音声信号を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成する際、前記 ピッチを示す特徴パラメータをフレーム間 (onset frame) で滑らかに変 化するよう補間し、これに併せて合成する各フレームデ ータのフレーム長を調整して該音声信号を復元するLP Cボコーダにおいて、 前記特徴パラメータの抽出時に前記パワーの抽出にあた って、合成時に調整されるフレーム長に対応して抽出対 象フレームデータ長を調節する手段を備えたことを特徴 とするLPCボコーダ。

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータを抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (特徴パラメータ) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH10232699A
CLAIM 1
【請求項1】 音声信号 (sound signal, speech signal) を一定時間長のフレームデータ 毎に分割し、各フレームデータの、少なくとも有声音, 無声音の別、パワー、フレームデータ合成時に使用する 声道等価フィルタの特性係数および有声音の場合にはそ のピッチを含む特徴パラメータ (recovery parameters, phase information parameter) を抽出し、この特徴パラ メータに対応する各フレームデータを合成することによ り該音声信号を復元するLPCボコーダにおいて、 有声音のフレームデータの合成時に生成する励起信号に 2次パルスを付加する手段を備えたことを特徴とするL PCボコーダ。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6317714B1

Filed: 1997-02-04     Issued: 2001-11-13

Controller and associated mechanical characters operable for continuously performing received control data while engaging in bidirectional communications over a single communications channel

(Original Assignee) Microsoft Corp     (Current Assignee) Chartoleaux KG LLC

Leonardo Del Castillo, Damon Vincent Danieli, Scott Randell, Craig S. Ranta, Harjit Singh
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (first time) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6317714B1
CLAIM 23
. A communications system for providing bandwidth efficient , bidirectional communications over a single communications channel for controlling the operation of remote devices , comprising : a computer system including a processing unit , a memory storage device coupled to the processing unit of the computer system , a master program module located in the memory storage device for providing instructions to the processing unit of the computer system , a speaker , and a display device , a link master controller , being functionally connected to the computer system and including a processing unit , and a transceiver system ;
a remote device including a transceiver system , a speech synthesizer , a speaker , at least one motion servo motors , a data buffer , and at least one sensor device ;
the computer system being operative to continuously provide an audio/video presentation on the speaker and display device ;
during a first time (onset frame) period , the computer system being operative to : retrieve control data from the memory storage device , the control data having a relationship with the current state of the audio/video presentation , and provide control data to the link master controller ;
the link master controller being operative to : receive the control data from the control system , encode the control data to reduce bandwidth requirements , and transmit the encoded control data to the remote device ;
and the remote device being operative to : receive the encoded control data , decode the encoded control data , place the control data into the data buffer , and operate on the control data in the data buffer by actuating at least one motion servo motor and providing data to the speech synthesizer ;
and during a second time period , the remote device being operative to : operate on the control data in the data buffer by actuating at least one motion servo motors and providing data to the speech synthesizer , formulate a response message based on the status of at least one sensor device , encode the response message , and transmit the encoded response message to the link master controller ;
the link master controller being operative to : receive the encoded response message from the remote device , decode the encoded response message , and provide the response message to the computer system ;
and the computer system being operative to : receive the response data from the link master controller , and alter the audio/video presentation in accordance with the response data .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (period T) = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6317714B1
CLAIM 16
. A system for utilizing a single communications channel for bi-directional communication between two devices , comprising : a first transceiver configured for transmitting control data for causing a remote device to respond in a manner dictated by the control data ;
a second transceiver , functionally connected to the first transceiver by the single communications channel ;
a data buffer , coupled to the second transceiver for storing the control data ;
a means for expending the control data from the data buffer and delivering the control data for performance by the remote device ;
during a time period T_F (E_q) , the first transceiver being operative to transmit the control data to the second transceiver over the single communications channel at a transfer rate R_F , and the second transceiver being operative to receive the control data at the transfer rate R_F , store the control data into the data buffer , and expend the control data from the data buffer at a consumption rate R_C , the consumption rate R_C being less than the transfer rate R_F ;
during a time period T_R , the second transceiver being operative to transmit response data to the first transceiver over the single communications channel at a transfer rate R_R , while expending the control data from the data buffer at the consumption rate R_C to cause the remote device to continuously perform in response to the control data while the first and second transceivers engage in bidirectional communications .
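
For orientation on the charted relation in claims 9 and 12 (and the corresponding device claims 21 and 25), the energy adjustment E_q = E_1 · E_LP0 / E_LP1 can be read as a gain correction applied to the decoded excitation of the first good frame after an erasure. The sketch below is a minimal illustration under assumed variable names and frame handling, not the patent holder's reference implementation:

```python
import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the impulse response of an all-pole LP synthesis filter 1/A(z),
    with a = [1, a1, ..., ap] (assumed coefficient convention)."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0              # unit impulse input
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * h[n - k]                # y[n] = x[n] - sum_k a[k] y[n-k]
        h[n] = acc
    return float(np.sum(h * h))

def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
    """Scale the excitation of the first non-erased frame so that its energy
    matches E_q = E_1 * E_LP0 / E_LP1 (notation of claims 9/12/21/25):
      E_1   - energy at the end of the current (concealed) frame,
      E_LP0 - impulse-response energy of the LP filter of the last good frame,
      E_LP1 - impulse-response energy of the LP filter of the first good frame."""
    x = np.asarray(excitation, dtype=float)
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    e_q = e1 * e_lp0 / max(e_lp1, 1e-12)          # target excitation energy
    e_cur = float(np.dot(x, x))                   # energy before adjustment
    return np.sqrt(e_q / max(e_cur, 1e-12)) * x
```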

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (period T) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6317714B1
CLAIM 16
. A system for utilizing a single communications channel for bi-directional communication between two devices , comprising : a first transceiver configured for transmitting control data for causing a remote device to respond in a manner dictated by the control data ;
a second transceiver , functionally connected to the first transceiver by the single communications channel ;
a data buffer , coupled to the second transceiver for storing the control data ;
a means for expending the control data from the data buffer and delivering the control data for performance by the remote device ;
during a time period T_F (E_q) , the first transceiver being operative to transmit the control data to the second transceiver over the single communications channel at a transfer rate R_F , and the second transceiver being operative to receive the control data at the transfer rate R_F , store the control data into the data buffer , and expend the control data from the data buffer at a consumption rate R_C , the consumption rate R_C being less than the transfer rate R_F ;
during a time period T_R , the second transceiver being operative to transmit response data to the first transceiver over the single communications channel at a transfer rate R_R , while expending the control data from the data buffer at the consumption rate R_C to cause the remote device to continuously perform in response to the control data while the first and second transceivers engage in bidirectional communications .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (first time) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6317714B1
CLAIM 23
. A communications system for providing bandwidth efficient , bidirectional communications over a single communications channel for controlling the operation of remote devices , comprising : a computer system including a processing unit , a memory storage device coupled to the processing unit of the computer system , a master program module located in the memory storage device for providing instructions to the processing unit of the computer system , a speaker , and a display device , a link master controller , being functionally connected to the computer system and including a processing unit , and a transceiver system ;
a remote device including a transceiver system , a speech synthesizer , a speaker , at least one motion servo motors , a data buffer , and at least one sensor device ;
the computer system being operative to continuously provide an audio/video presentation on the speaker and display device ;
during a first time (onset frame) period , the computer system being operative to : retrieve control data from the memory storage device , the control data having a relationship with the current state of the audio/video presentation , and provide control data to the link master controller ;
the link master controller being operative to : receive the control data from the control system , encode the control data to reduce bandwidth requirements , and transmit the encoded control data to the remote device ;
and the remote device being operative to : receive the encoded control data , decode the encoded control data , place the control data into the data buffer , and operate on the control data in the data buffer by actuating at least one motion servo motor and providing data to the speech synthesizer ;
and during a second time period , the remote device being operative to : operate on the control data in the data buffer by actuating at least one motion servo motors and providing data to the speech synthesizer , formulate a response message based on the status of at least one sensor device , encode the response message , and transmit the encoded response message to the link master controller ;
the link master controller being operative to : receive the encoded response message from the remote device , decode the encoded response message , and provide the response message to the computer system ;
and the computer system being operative to : receive the response data from the link master controller , and alter the audio/video presentation in accordance with the response data .
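
As a reading aid for the artificial onset reconstruction recited in claims 1 and 13, the periodic excitation part is a train of low-pass filter impulse responses: the first is centered on the quantized first-glottal-pulse position and the remaining ones follow at the rounded average pitch value. A minimal sketch, assuming an arbitrary low-pass impulse response and frame length (not the codec's actual values):

```python
import numpy as np

def build_periodic_excitation(first_pulse_pos, avg_pitch, frame_len, lowpass_h):
    """Artificial periodic excitation for a lost onset frame (claims 1/13
    reading): center the first low-pass impulse response on the quantized
    first glottal pulse position, then repeat it every average pitch period
    up to the end of the last affected subframe (here: the frame end)."""
    excitation = np.zeros(frame_len)
    half = len(lowpass_h) // 2
    pos = int(round(first_pulse_pos))
    pitch = max(int(round(avg_pitch)), 1)
    while pos < frame_len:
        start = pos - half                       # center the response on 'pos'
        for i, tap in enumerate(lowpass_h):
            idx = start + i
            if 0 <= idx < frame_len:
                excitation[idx] += tap
        pos += pitch                             # next pulse one pitch later
    return excitation
```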

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (period T) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6317714B1
CLAIM 16
. A system for utilizing a single communications channel for bi-directional communication between two devices , comprising : a first transceiver configured for transmitting control data for causing a remote device to respond in a manner dictated by the control data ;
a second transceiver , functionally connected to the first transceiver by the single communications channel ;
a data buffer , coupled to the second transceiver for storing the control data ;
a means for expending the control data from the data buffer and delivering the control data for performance by the remote device ;
during a time period T_F (E_q) , the first transceiver being operative to transmit the control data to the second transceiver over the single communications channel at a transfer rate R_F , and the second transceiver being operative to receive the control data at the transfer rate R_F , store the control data into the data buffer , and expend the control data from the data buffer at a consumption rate R_C , the consumption rate R_C being less than the transfer rate R_F ;
during a time period T_R , the second transceiver being operative to transmit response data to the first transceiver over the single communications channel at a transfer rate R_R , while expending the control data from the data buffer at the consumption rate R_C to cause the remote device to continuously perform in response to the control data while the first and second transceivers engage in bidirectional communications .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (period T) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6317714B1
CLAIM 16
. A system for utilizing a single communications channel for bi-directional communication between two devices , comprising : a first transceiver configured for transmitting control data for causing a remote device to respond in a manner dictated by the control data ;
a second transceiver , functionally connected to the first transceiver by the single communications channel ;
a data buffer , coupled to the second transceiver for storing the control data ;
a means for expending the control data from the data buffer and delivering the control data for performance by the remote device ;
during a time period T_F (E_q) , the first transceiver being operative to transmit the control data to the second transceiver over the single communications channel at a transfer rate R_F , and the second transceiver being operative to receive the control data at the transfer rate R_F , store the control data into the data buffer , and expend the control data from the data buffer at a consumption rate R_C , the consumption rate R_C being less than the transfer rate R_F ;
during a time period T_R , the second transceiver being operative to transmit response data to the first transceiver over the single communications channel at a transfer rate R_R , while expending the control data from the data buffer at the consumption rate R_C to cause the remote device to continuously perform in response to the control data while the first and second transceivers engage in bidirectional communications .

US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5819213A

Filed: 1997-01-30     Issued: 1998-10-06

Speech encoding and decoding with pitch filter range unrestricted by codebook range and preselecting, then increasing, search candidates from linear overlap codebooks

(Original Assignee) Toshiba Corp     (Current Assignee) Toshiba Corp

Masahiro Oshikiri, Tadashi Amada, Masami Akamine, Kimio Miseki
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .
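
For claims 2, 10, 14 and 22, the phase information parameter also carries an encoded shape, sign and amplitude of the first glottal pulse. The sketch below is only a hedged illustration of such an encoder; the shape codebook, the amplitude quantization step and the returned indices are assumptions, not the codec's actual bit allocation:

```python
import numpy as np

def encode_first_glottal_pulse(pulse_segment, shape_codebook, amp_step=0.5):
    """Encode shape, sign and amplitude of the first glottal pulse
    (claims 2/10/14/22 reading). 'shape_codebook' is an assumed small set of
    normalized pulse shapes; 'amp_step' is an assumed amplitude quantizer step."""
    x = np.asarray(pulse_segment, dtype=float)
    peak = int(np.argmax(np.abs(x)))
    sign = 1 if x[peak] >= 0 else -1                    # sign of the pulse
    amplitude = float(np.abs(x[peak]))
    amp_index = int(round(amplitude / amp_step))        # quantized amplitude
    target = sign * x / max(amplitude, 1e-12)           # normalized pulse shape
    # choose the codebook shape most correlated with the normalized pulse
    scores = [float(np.dot(target[:len(c)], np.asarray(c)[:len(target)]))
              for c in shape_codebook]
    shape_index = int(np.argmax(scores))
    return shape_index, sign, amp_index
```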

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .
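
Claims 3, 11, 15 and 23 locate the first glottal pulse as the maximum-amplitude sample within the first pitch period and quantize its position. A minimal sketch, assuming the LP residual and an integer pitch lag are available and using a uniform quantizer purely for illustration:

```python
import numpy as np

def quantize_first_glottal_pulse_position(residual, pitch_period, step=2):
    """First-glottal-pulse position (claims 3/11/15/23 reading): the sample of
    maximum amplitude within the first pitch period is taken as the first
    glottal pulse and its position is quantized ('step' is an assumed
    uniform quantization step, in samples)."""
    search = np.asarray(residual[:int(pitch_period)], dtype=float)
    pos = int(np.argmax(np.abs(search)))      # sample of maximum amplitude
    index = pos // step                       # quantizer index to transmit
    return index, index * step                # (index, reconstructed position)
```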

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .
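
Claim 4 (and device claim 16) computes the energy information parameter in two ways depending on the frame class. The branch structure is sketched below with assumed class labels; the windowing and pitch-synchronous maximum search used in a real coder are omitted:

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter (claims 4/16 reading):
    - VOICED or ONSET frames: related to the maximum of the signal energy
      (illustrated here simply as the maximum squared sample);
    - other frames: the average energy per sample."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("VOICED", "ONSET"):
        return float(np.max(x * x))           # maximum of the signal energy
    return float(np.mean(x * x))              # average energy per sample
```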

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .
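
Claims 5 and 17 recite a two-point energy control in the first good frame after an erasure: the frame-start energy is matched to the end of the concealed signal, and the energy then converges toward the transmitted value by the frame end while any increase is capped. A minimal sketch using a sample-wise linear gain ramp (the interpolation shape and the gain cap are assumptions):

```python
import numpy as np

def scale_first_good_frame(synth, e_end_concealed, e_begin, e_target, e_end,
                           max_gain=2.0):
    """Energy control of the synthesized signal (claims 5/17 reading):
    g0 matches the frame-start energy to the end of the concealed signal,
    g1 converges the frame-end energy to the received energy information,
    and 'max_gain' (an assumed cap) limits the increase in energy."""
    x = np.asarray(synth, dtype=float)
    g0 = min(np.sqrt(e_end_concealed / max(e_begin, 1e-12)), max_gain)
    g1 = min(np.sqrt(e_target / max(e_end, 1e-12)), max_gain)
    n = len(x)
    gains = g0 + (g1 - g0) * np.arange(n) / max(n - 1, 1)   # ramp g0 -> g1
    return gains * x
```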

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number (last non) of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .
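
Claims 7 and 19 force the scaling gain at the start of the first good frame to equal the gain used at its end in two transition cases. The decision logic could be expressed as below; the class and coding-mode labels are assumed names for illustration only:

```python
def equalize_scaling_gains(last_good_class, first_good_class,
                           last_good_mode, first_good_mode):
    """True when the gain used at the beginning of the first good frame should
    equal the gain used at its end (claims 7/19 reading):
    - voiced-to-unvoiced transition: last good frame VOICED_TRANSITION, VOICED
      or ONSET and first good frame UNVOICED;
    - inactive-to-active transition: last good frame coded as comfort noise
      and first good frame coded as active speech."""
    voiced_to_unvoiced = (
        last_good_class in ("VOICED_TRANSITION", "VOICED", "ONSET")
        and first_good_class == "UNVOICED"
    )
    inactive_to_active = (last_good_mode == "COMFORT_NOISE"
                          and first_good_mode == "ACTIVE_SPEECH")
    return voiced_to_unvoiced or inactive_to_active
```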

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number (last non) of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number (last non) of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number (last non) of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number (last non) of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .
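
Claim 22 has the encoder transmit the shape, sign and amplitude of the first glottal pulse as part of the phase information. The sketch below gives one hedged reading of that step: the sign and a quantized amplitude are coded directly, and the local waveform around the pulse is matched against a small, purely hypothetical shape codebook. None of the quantizer choices below are taken from the patent.

import numpy as np

# Hypothetical 2-entry shape codebook (normalized pulse shapes).
SHAPE_CODEBOOK = np.array([
    [0.2, 1.0, 0.2],      # narrow pulse
    [0.5, 1.0, 0.5],      # broader pulse
])

def encode_first_glottal_pulse(residual, pulse_pos):
    """Return (shape_index, sign, amplitude_index) for the glottal
    pulse at pulse_pos in the LP residual (illustrative only)."""
    amp = residual[pulse_pos]
    sign = 1 if amp >= 0.0 else -1
    # 4-bit logarithmic amplitude quantizer (assumed, not from the patent).
    amp_index = int(np.clip(np.round(4.0 * np.log2(abs(amp) + 1e-9)) + 8, 0, 15))
    # Pick the codebook shape closest to the normalized local waveform.
    local = residual[max(pulse_pos - 1, 0):pulse_pos + 2]
    local = np.resize(local, 3) / (abs(amp) + 1e-9)
    shape_index = int(np.argmin(np.sum((SHAPE_CODEBOOK - sign * local) ** 2, axis=1)))
    return shape_index, sign, amp_index

res = np.zeros(160)
res[36], res[37], res[38] = -0.3, -0.9, -0.2
print(encode_first_glottal_pulse(res, 37))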

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .
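
Claim 23 defines the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizes its position. A minimal sketch, assuming the search is run on the LP residual of the frame and using a simple uniform position quantizer (the step size is an assumption, not taken from the patent):

import numpy as np

def find_and_quantize_first_pulse(residual, pitch_period, step=2):
    """Locate the maximum-amplitude sample within the first pitch
    period of the frame and quantize its position with a uniform
    step (illustrative; the patent does not fix this quantizer)."""
    search = residual[:pitch_period]
    pos = int(np.argmax(np.abs(search)))      # first glottal pulse
    q_index = pos // step                      # transmitted quantizer index
    q_pos = q_index * step                     # position recovered at the decoder
    return pos, q_index, q_pos

rng = np.random.default_rng(1)
res = 0.1 * rng.standard_normal(160)
res[53] = 1.5                                  # dominant pulse
print(find_and_quantize_first_pulse(res, pitch_period=80))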

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .
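
Claim 24 computes the energy information parameter differently depending on the frame class: in relation to a maximum of the signal energy for frames classified as voiced or onset, and in relation to an average energy per sample otherwise. A short numpy sketch; the per-sample maximum used here stands in for whatever pitch-synchronous windowing the codec actually applies, which is an assumption of this illustration.

import numpy as np

def energy_information(frame, frame_class):
    """Energy parameter per claim 24: maximum of the signal energy for
    'voiced'/'onset' frames, average energy per sample otherwise."""
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))       # maximum of signal energy
    return float(np.mean(frame ** 2))          # average energy per sample

frame = np.sin(2 * np.pi * 0.02 * np.arange(256))
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))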

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (residual error) and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819213A
CLAIM 2
. A method according to claim 1 , wherein the codebook uses an adaptive codebook (sound signal, speech signal) expressing a plurality of pitch periods within a predetermined search range and a noise codebook expressing a noise string within a predetermined number (last non) of candidates , and the searching of the codebook includes searching the adaptive codebook and the noise codebook on the basis of the analysis result and combining a pitch period and a noise string by which the distortion is minimized .

US5819213A
CLAIM 4
. A method according to claim 3 , further comprising calculating a prediction residual error (frame concealment, decoder determines concealment) signal of the input speech signal by using the LPC coefficient , calculating , on the basis of a signal obtained by multiplying the prediction residual error signal by a Hamming window , an autocorrelation value within a predetermined pitch period analysis range , calculating a pitch period at which the autocorrelation value is a maximum , and calculating the pitch filter coefficient from the prediction residual error signal and the pitch period .

US5819213A
CLAIM 15
. A speech decoding apparatus (decoder recovery, decoder constructs) comprising : means for analyzing a pitch period of a decoded speech signal obtained by decoding encoded data ;
a post filter including a pitch filter for emphasizing a pitch period component of the decoded speech signal ;
and means for setting an analysis range of the pitch period to be supplied to the pitch filter so that the analysis range is wider than a range of a pitch period which can be expressed by the encoded data .
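
US5819213A claim 4, charted above, derives the pitch period from the autocorrelation of the Hamming-windowed LPC prediction residual. The sketch below follows that outline; the LPC order, the 20-143 sample analysis range and the use of numpy/scipy routines are assumptions of this illustration rather than the reference's values.

import numpy as np
from scipy.signal import lfilter

def pitch_from_residual(speech, lpc_a, lag_min=20, lag_max=143):
    """Estimate the pitch period from the autocorrelation of the
    Hamming-windowed LP prediction residual (after US5819213A cl. 4)."""
    residual = lfilter(lpc_a, [1.0], speech)        # apply A(z) to the speech
    windowed = residual * np.hamming(len(residual))
    best_lag, best_corr = lag_min, -np.inf
    for lag in range(lag_min, min(lag_max, len(windowed) - 1) + 1):
        corr = float(np.dot(windowed[lag:], windowed[:-lag]))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

fs = 8000
t = np.arange(0, 0.04, 1.0 / fs)                     # 40 ms of toy input
speech = np.sign(np.sin(2 * np.pi * 100 * t))         # 100 Hz pulse-like source
a = np.array([1.0, -0.9])                             # toy first-order A(z)
print(pitch_from_residual(speech, a))                 # ~80 samples at 8 kHz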

US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6085158A

Filed: 1997-01-22     Issued: 2000-07-04

Updating internal states of a speech decoder after errors have occurred

(Original Assignee) NTT Mobile Communications Networks Inc     (Current Assignee) NTT Docomo Inc

Nobuhiko Naka, Tomoyuki Ohya
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .
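
Claim 1 reconstructs the periodic excitation of a lost onset frame as a low-pass filtered train of pulses: the first impulse response of the low-pass filter is centered on the quantized position of the first glottal pulse, and the remaining impulse responses follow at the average pitch distance up to the end of the last affected subframe. A minimal sketch, with a short symmetric FIR response and an arbitrary frame layout standing in for the codec's actual values:

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_response):
    """Low-pass filtered periodic pulse train: center the low-pass
    impulse response on the quantized first-pulse position, then
    repeat it every avg_pitch samples up to the end of the segment."""
    exc = np.zeros(frame_len)
    half = len(lp_response) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        start, stop = pos - half, pos - half + len(lp_response)
        lo, hi = max(start, 0), min(stop, frame_len)
        exc[lo:hi] += lp_response[lo - start:hi - start]
        pos += avg_pitch
    return exc

# Hypothetical low-pass impulse response (short symmetric FIR).
h_lp = np.array([0.1, 0.3, 0.6, 1.0, 0.6, 0.3, 0.1])
exc = build_periodic_excitation(frame_len=256, first_pulse_pos=17,
                                avg_pitch=60, lp_response=h_lp)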

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .
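
US6085158A claim 1, reproduced above, keeps the decoder's internal state coherent by concealing an incorrect code sequence from a prior correct one and then, once correct sequences arrive again, redoing the state updates that had been made from the estimated sequences. The toy sketch below mimics only that control flow; the one-value "internal state" and its update rule are invented for illustration and are not the reference's decoder model.

class ToyDecoder:
    """Abstract decoder whose only internal state is a running
    prediction memory, updated after every decoded sequence
    (a stand-in for real LP / adaptive-codebook state)."""

    def __init__(self):
        self.state = 0.0
        self.last_good = 0.0

    def _decode(self, code):
        out = 0.5 * self.state + code        # toy decoding rule
        self.state = out                      # update internal state
        return out

    def receive(self, code, is_correct):
        if is_correct:
            self.last_good = code
            return self._decode(code)         # regular routine
        # Error concealment: first estimation from the prior good code.
        return self._decode(self.last_good)

    def recover(self, correct_codes):
        """Error recovery: build a second estimation from newly received
        correct codes and redo the state updates previously done from
        the first estimation, re-aligning with the encoder's state."""
        for code in correct_codes:
            self._decode(code)

dec = ToyDecoder()
dec.receive(1.0, True)
dec.receive(None, False)                      # erased: conceal from last good
dec.recover([0.9])                            # corrected data re-syncs the state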

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder (last frame, replacement frame) code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .
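
Claim 5 controls the energy of the synthesized signal in the first good frame after an erasure in two steps: scale its start to match the energy at the end of the last concealed frame, then converge toward the energy implied by the received energy information over the frame while limiting any increase. A minimal sketch using a linear per-sample gain ramp; the ramp shape and the cap on the increase are assumptions of the illustration.

import numpy as np

def smooth_energy_transition(synth, E_end_concealed, E_received, max_gain_increase=2.0):
    """Scale the first non-erased frame so its start matches the energy at
    the end of the concealed segment, then converge linearly toward the
    received energy information while limiting the increase in energy."""
    n = len(synth)
    E_start = float(np.mean(synth[: n // 4] ** 2)) + 1e-12  # energy at frame start
    g0 = np.sqrt(E_end_concealed / E_start)                  # match previous energy
    g1 = np.sqrt(E_received / E_start)                       # target from energy info
    g1 = min(g1, g0 * max_gain_increase)                     # limit the increase
    gains = g0 + (g1 - g0) * np.arange(n) / (n - 1)          # per-sample linear ramp
    return gains * synth

frame = np.random.default_rng(2).standard_normal(256)
out = smooth_energy_transition(frame, E_end_concealed=0.25, E_received=1.0)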

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (error concealment) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .
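
Claim 6 adds a special case to the scaling above: when the first non-erased frame after the erasure is classified as onset, the gain used for scaling the synthesized signal is limited to a given value. In the terms of the previous sketch this is a simple clamp on the starting gain; the cap of 1.5 below is an arbitrary placeholder, not a value from the patent.

def onset_gain_limit(g0, frame_class, onset_cap=1.5):
    """Limit the scaling gain when the first good frame is an onset."""
    return min(g0, onset_cap) if frame_class == "onset" else g0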

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder (last frame, replacement frame) code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (error concealment) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder (last frame, replacement frame) code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 13
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 14
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 15
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 16
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 17
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder (last frame, replacement frame) code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (error concealment) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 20
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (error concealment) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder (last frame, replacement frame) code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 22
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 23
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 24
. A device for conducting concealment (error concealment) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (error concealment) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (error concealment) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
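
As a reading aid for the relation quoted above (E_q = E_1 · E_LP0 / E_LP1), the following Python sketch computes the impulse-response energies of two LP synthesis filters and the corrected excitation energy. The all-pole convention A(z) = 1 + a[0]·z^-1 + a[1]·z^-2 + ... and the 64-sample truncation are illustrative assumptions, not values taken from the patent or the cited reference.

import numpy as np

def lp_impulse_response_energy(a, length=64):
    # Energy of the impulse response of 1/A(z), A(z) = 1 + a[0]z^-1 + a[1]z^-2 + ...
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def corrected_excitation_energy(e1, a_last_good, a_first_good):
    # e1: energy at the end of the current (concealed) frame
    e_lp0 = lp_impulse_response_energy(a_last_good)   # last good frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # first good frame after erasure
    return e1 * e_lp0 / e_lp1                         # E_q = E_1 * E_LP0 / E_LP1
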
US6085158A
CLAIM 1
. A speech decoder that receives from a speech encoder (last frame, replacement frame) code sequences representative of speech data and decodes said code sequences to recreate said speech data , said speech decoder comprising : (a) an error detecting unit for detecting whether a correct or incorrect code sequence is received ;
(b) a decoding unit for decoding said code sequences in succession , using internal decoding information , wherein said decoding unit , each time decoding one code sequence , updates said internal decoding information based on decoding of said one code sequence for decoding of a next code sequence ;
(c) a regular routine unit for supplying correct code sequences to said decoding unit for decoding to recreate said speech data ;
(d) an error concealment (decoder concealment, frame erasure concealment, frame concealment, determining concealment, conducting concealment, decoder determines concealment) unit responsive to receipt of an incorrect code sequence for constructing a code sequence of first estimation , using a prior correct code sequence , and supplying said code sequence of first estimation to said decoding unit for decoding to recreate said speech data ;
and (e) an error recovery unit responsive to receipt of one or more correct code sequences subsequent to receipt of one or more incorrect code sequences for constructing one or more code sequences of second estimation , using said one or more correct code sequences , and supplying said one or more code sequences of second estimation to said decoding unit for decoding , wherein said decoding unit , as decoding said one or more code sequences of second estimation , redoes the updating of said internal decoding information previously done based on decoding of one or more code sequences of first estimation constructed in response to receipt of said one or more incorrect code sequences in order to make said internal decoding information coincident with internal encoding information of said speech encoder .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH10233692A

Filed: 1997-01-16     Issued: 1998-09-02

Audio signal encoding apparatus and encoding method, and audio signal decoding apparatus and decoding method (オーディオ信号符号化装置および符号化方法並びにオーディオ信号復号装置および復号方法)

(Original Assignee) Sony Corp; ソニー株式会社     

Masaaki Isozaki, 正明 五十崎
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (選択指示) up to the end of a last subframe affected by the artificial construction of the periodic part .
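
For context, claim 1 above recites rebuilding the periodic excitation of a lost onset frame as a low-pass filtered train of pulses: the first impulse response is centred on the quantized first-glottal-pulse position and the remaining ones are spaced by the average pitch value. A hedged Python sketch of that construction follows; the frame length, pulse position, pitch value and the normalized Hamming window used as the low-pass impulse response are hypothetical illustration values only.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_fir):
    exc = np.zeros(frame_len)
    half = len(lp_fir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        for i, h in enumerate(lp_fir):       # centre one impulse response at pos
            idx = pos - half + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += avg_pitch                     # next pulse one average pitch later
    return exc

# Hypothetical usage: 256-sample frame, first glottal pulse at sample 37, pitch of 80 samples.
fir = np.hamming(11)
fir = fir / fir.sum()
excitation = build_periodic_excitation(256, 37, 80, fir)
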
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 11
[Claim 11] The audio signal decoding device according to claim 8, comprising: means for decoding each of the plurality of received bitstreams into a plurality of hierarchical data; and means for adding the plurality of decoded hierarchical data, or selecting a part of the plurality of hierarchical data, on the basis of a selection instruction (preceding impulse response) signal.

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
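
A hedged Python sketch of the phase-information step recited above: the sample of maximum amplitude inside the first pitch period is taken as the first glottal pulse and its position is quantized. The uniform 4-sample quantization step is an assumption for illustration only.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))    # sample of maximum amplitude
    q_pos = (pos // step) * step             # coarsely quantized position
    return pos, q_pos
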
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 9
[Claim 9] An audio signal decoding device which receives and decodes a plurality of bitstreams generated by separating a digital audio signal into a plurality of hierarchical data on the basis of frequency and encoding the plurality of hierarchical data, the device comprising (maximum amplitude) means for decoding the plurality of received bitstreams into a plurality of hierarchical data and for adaptively selecting the whole or a part of the plurality of hierarchical data.

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
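
A hedged Python sketch of the energy-information computation recited above: the parameter follows the maximum of the signal energy for frames classified as voiced or onset, and the average energy per sample otherwise. The dB conversion is an illustrative convention, not taken from the claim.

import numpy as np

def energy_information(frame, frame_class):
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        e = float(np.max(frame ** 2))        # maximum of the signal energy
    else:
        e = float(np.mean(frame ** 2))       # average energy per sample
    return 10.0 * np.log10(e + 1e-12)
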
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
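
A hedged Python sketch of the energy control recited above: the first good frame after an erasure is scaled so that its initial energy matches the energy at the end of the last concealed frame, and the gain then converges toward the value implied by the received energy parameter while any increase is capped. The 32-sample measurement windows, linear gain ramp and 2.0 cap are assumptions, not values from the patent.

import numpy as np

def control_energy(synth, e_end_concealed, e_target, max_gain=2.0):
    synth = np.asarray(synth, dtype=float)
    e_begin = float(np.mean(synth[:32] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-32:] ** 2)) + 1e-12
    g0 = np.sqrt(e_end_concealed / e_begin)          # match energy at frame start
    g1 = min(np.sqrt(e_target / e_end), max_gain)    # limit the increase in energy
    gain = np.linspace(g0, g1, len(synth))           # converge toward the target
    return synth * gain
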
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (エラー) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame (ワーク) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10233692A
CLAIM 6
[Claim 6] The audio signal encoding device according to claim 1, wherein a degree of congestion of a network (current frame, replacement frame) for transferring the bitstream is detected, and information indicating which hierarchical data the receiving side is to use is transmitted in accordance with the degree of congestion of the network.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH10233692A
CLAIM 9
[Claim 9] An audio signal decoding device which receives and decodes a plurality of bitstreams generated by separating a digital audio signal into a plurality of hierarchical data on the basis of frequency and encoding the plurality of hierarchical data, the device comprising (maximum amplitude) means for decoding the plurality of received bitstreams into a plurality of hierarchical data and for adaptively selecting the whole or a part of the plurality of hierarchical data.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (ワーク) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (エラー) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame (ワーク) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 6
[Claim 6] The audio signal encoding device according to claim 1, wherein a degree of congestion of a network (current frame, replacement frame) for transferring the bitstream is detected, and information indicating which hierarchical data the receiving side is to use is transmitted in accordance with the degree of congestion of the network.

US7693710B2
CLAIM 13
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment (エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (選択指示) up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 11
[Claim 11] The audio signal decoding device according to claim 8, comprising: means for decoding each of the plurality of received bitstreams into a plurality of hierarchical data; and means for adding the plurality of decoded hierarchical data, or selecting a part of the plurality of hierarchical data, on the basis of a selection instruction (preceding impulse response) signal.

US7693710B2
CLAIM 14
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 15
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 9
[Claim 9] An audio signal decoding device which receives and decodes a plurality of bitstreams generated by separating a digital audio signal into a plurality of hierarchical data on the basis of frequency and encoding the plurality of hierarchical data, the device comprising (maximum amplitude) means for decoding the plurality of received bitstreams into a plurality of hierarchical data and for adaptively selecting the whole or a part of the plurality of hierarchical data.

US7693710B2
CLAIM 16
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 17
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (エラー) and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (エラー) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 20
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment (エラー) and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame (ワーク) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH10233692A
CLAIM 6
[Claim 6] The audio signal encoding device according to claim 1, wherein a degree of congestion of a network (current frame, replacement frame) for transferring the bitstream is detected, and information indicating which hierarchical data the receiving side is to use is transmitted in accordance with the degree of congestion of the network.

US7693710B2
CLAIM 22
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 23
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 9
[Claim 9] An audio signal decoding device which receives and decodes a plurality of bitstreams generated by separating a digital audio signal into a plurality of hierarchical data on the basis of frequency and encoding the plurality of hierarchical data, the device comprising (maximum amplitude) means for decoding the plurality of received bitstreams into a plurality of hierarchical data and for adaptively selecting the whole or a part of the plurality of hierarchical data.

US7693710B2
CLAIM 24
. A device for conducting concealment (エラー) of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (ワーク) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (エラー) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (エラー) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame (ワーク) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH10233692A
CLAIM 2
[Claim 2] The audio signal encoding device according to claim 1, further comprising error (decoder concealment, frame erasure concealment, frame concealment, conducting concealment) correction encoding means for applying error correction encoding to the plurality of bitstreams, wherein, for the lower-band hierarchical data among the plurality of hierarchical data, the correction capability provided by the error correction encoding is made higher than for the higher-band hierarchical data.

JPH10233692A
CLAIM 6
[Claim 6] The audio signal encoding device according to claim 1, wherein a degree of congestion of a network (current frame, replacement frame) for transferring the bitstream is detected, and information indicating which hierarchical data the receiving side is to use is transmitted in accordance with the degree of congestion of the network.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5806024A

Filed: 1996-12-23     Issued: 1998-09-08

Coding of a speech or music signal with quantization of harmonics components specifically and then residue components

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (successive frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses, response signal, inverse filter) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses, response signal, inverse filter) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients ;
an inverse filter (impulse responses, impulse response) responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses (impulse responses, impulse response) of said linear prediction coefficients to produce an impulse response signal (impulse responses, impulse response) representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .
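For orientation only, a hedged Python sketch of the harmonics-location step in US5806024A claim 8: given an extracted pitch frequency, the harmonic locations are the integer multiples of the pitch bin in the frame's transform, and the coefficients at those locations feed the harmonics quantizer. The sampling rate and DFT size are hypothetical illustration values, not the reference's.

import numpy as np

def harmonics_coefficients(frame, pitch_hz, fs=8000, nfft=256):
    spectrum = np.fft.rfft(np.asarray(frame, dtype=float), nfft)
    pitch_bin = pitch_hz * nfft / fs
    n_harm = int((nfft // 2) / pitch_bin)
    bins = [int(round(k * pitch_bin)) for k in range(1, n_harm + 1)]
    return bins, spectrum[bins]              # harmonics locations and coefficients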

US5806024A
CLAIM 18
. A signal encoding device comprising : an orthogonal transform circuit responsive to a device input signal for calculating an input orthogonal transform of said device input signal to produce input orthogonal transform coefficients of said input orthogonal transform ;
a pitch extracting circuit for extracting a pitch frequency from each of successive frames (onset frame) of said device input signal and for discriminating said successive frames between a voiced and an unvoiced frame ;
a pulse searching circuit for repeatedly searching in said voiced frame a voiced frame pulse sequence of primary excitation pulses by using said pitch frequency and in said unvoiced frame an unvoiced frame pulse sequence of secondary excitation pulses without using said pitch frequency ;
a harmonics quantizer for quantizing said primary excitation pulses jointly as a representative pulse into a pulse code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said orthogonal transform coefficients ;
whereby said device input signal is encoded into a device output signal comprising a pitch internal of said pitch frequency , information separately indicative of said voiced and said unvoiced frames , and indexes indicative of pulse positions of said primary and said secondary excitation pulses , said pulse code vector , said residue code vectors , and said gain code vectors .
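For orientation only, a hedged Python sketch of the voiced/unvoiced discrimination and pulse search recited in US5806024A claim 18: a pitch lag is found by autocorrelation, the frame is declared voiced when the normalized correlation is high, pulses are then placed at pitch-spaced positions in voiced frames and searched freely in unvoiced frames. The lag range, 0.5 threshold and 10-pulse budget are illustrative assumptions, not the reference's values.

import numpy as np

def classify_and_place_pulses(frame, min_lag=20, max_lag=147):
    frame = np.asarray(frame, dtype=float)
    ac = [float(np.dot(frame[lag:], frame[:-lag])) for lag in range(min_lag, max_lag)]
    lag = int(np.argmax(ac)) + min_lag
    norm = ac[lag - min_lag] / (float(np.dot(frame, frame)) + 1e-12)
    voiced = norm > 0.5
    if voiced:
        first = int(np.argmax(np.abs(frame[:lag])))
        positions = list(range(first, len(frame), lag))        # pitch-spaced pulses
    else:
        positions = sorted(int(i) for i in np.argsort(np.abs(frame))[-10:])
    return voiced, lag, positions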

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
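
A hedged Python sketch of the special case in claim 7 quoted above: for a voiced-to-unvoiced transition, or when recovery follows a comfort-noise period, the gain used for scaling the synthesized signal at the beginning of the first good frame is simply made equal to the gain used at its end. The class labels are illustrative strings, not codec symbols.

def recovery_start_gain(last_good_class, first_good_class, g_end):
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = last_good_class in voiced_like and first_good_class == "unvoiced"
    noise_to_speech = last_good_class == "comfort noise" and first_good_class == "active speech"
    if voiced_to_unvoiced or noise_to_speech:
        return g_end                 # gain at frame start made equal to gain at frame end
    return None                      # otherwise fall back to the normal energy control
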
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficient (comfort noise) s ;
an inverse filter responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses of said linear prediction coefficients to produce an impulse response signal representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse responses, response signal, inverse filter) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients ;
an inverse filter (impulse responses, impulse response) responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses (impulse responses, impulse response) of said linear prediction coefficients to produce an impulse response signal (impulse responses, impulse response) representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .

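The relation E_q = E_1 · (E_LP0 / E_LP1) recited in claims 9, 12, 21 and 25 of US7693710B2 amounts to rescaling the decoder's LP excitation to a target energy. A minimal Python sketch follows; the helper names and the square-root gain used to impose the target energy are illustrative assumptions, not code from the patent.

```python
import numpy as np

def target_excitation_energy(e1, e_lp0, e_lp1):
    """E_q = E_1 * (E_LP0 / E_LP1): E_1 is the energy at the end of the current
    (concealed) frame; E_LP0 and E_LP1 are the impulse-response energies of the LP
    filters of the last good frame before and the first good frame after erasure."""
    return e1 * e_lp0 / e_lp1

def rescale_excitation(excitation, e_q):
    """Scale the first good frame's excitation so that its energy becomes E_q."""
    e_cur = float(np.sum(excitation ** 2))
    gain = np.sqrt(e_q / e_cur) if e_cur > 0.0 else 1.0
    return gain * excitation
```
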
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse responses, response signal, inverse filter) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients ;
an inverse filter (impulse responses, impulse response) responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses (impulse responses, impulse response) of said linear prediction coefficients to produce an impulse response signal (impulse responses, impulse response) representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (successive frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse responses, response signal, inverse filter) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse responses, response signal, inverse filter) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients ;
an inverse filter (impulse responses, impulse response) responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses (impulse responses, impulse response) of said linear prediction coefficients to produce an impulse response signal (impulse responses, impulse response) representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .

US5806024A
CLAIM 18
. A signal encoding device comprising : an orthogonal transform circuit responsive to a device input signal for calculating an input orthogonal transform of said device input signal to produce input orthogonal transform coefficients of said input orthogonal transform ;
a pitch extracting circuit for extracting a pitch frequency from each of successive frames (onset frame) of said device input signal and for discriminating said successive frames between a voiced and an unvoiced frame ;
a pulse searching circuit for repeatedly searching in said voiced frame a voiced frame pulse sequence of primary excitation pulses by using said pitch frequency and in said unvoiced frame an unvoiced frame pulse sequence of secondary excitation pulses without using said pitch frequency ;
a harmonics quantizer for quantizing said primary excitation pulses jointly as a representative pulse into a pulse code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said orthogonal transform coefficients ;
whereby said device input signal is encoded into a device output signal comprising a pitch interval of said pitch frequency , information separately indicative of said voiced and said unvoiced frames , and indexes indicative of pulse positions of said primary and said secondary excitation pulses , said pulse code vector , said residue code vectors , and said gain code vectors .

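Claims 1 and 13 of US7693710B2 construct the periodic excitation of a lost onset frame by centring a low-pass filter impulse response on the quantized position of the first glottal pulse and repeating it every average pitch period. A minimal Python sketch, assuming a short symmetric FIR prototype and a single-frame buffer (the codec's subframe bookkeeping is omitted):

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps):
    """Place a low-pass impulse response at the quantized first-pulse position,
    then every avg_pitch samples, as recited for lost onset frames."""
    assert avg_pitch > 0
    exc = np.zeros(frame_len)
    half = len(lp_taps) // 2
    pos = float(first_pulse_pos)
    while pos < frame_len:
        start = int(round(pos)) - half
        for k, tap in enumerate(lp_taps):
            idx = start + k
            if 0 <= idx < frame_len:
                exc[idx] += tap           # centre the impulse response on the pulse
        pos += avg_pitch                  # next pulse one average pitch period later
    return exc

# Example: 256-sample frame, first glottal pulse quantized at sample 35, pitch ~60.3
taps = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # toy low-pass impulse response
excitation = build_periodic_excitation(256, 35, 60.3, taps)
```
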
US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients (comfort noise) ;
an inverse filter responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses of said linear prediction coefficients to produce an impulse response signal representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .

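Claims 7 and 19 of US7693710B2 make the gain used at the beginning of the first good frame equal to the gain used at its end in two situations: a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition. A minimal Python sketch of that decision; the class labels and flags are illustrative placeholders, not the codec's actual data structures.

```python
VOICED_LIKE = {"voiced", "voiced transition", "onset"}

def equalize_begin_gain(g_begin, g_end, last_good_class, first_good_class,
                        last_good_was_comfort_noise, first_good_is_active):
    """Return the gain to use at the beginning of the first good frame."""
    voiced_to_unvoiced = (last_good_class in VOICED_LIKE and
                          first_good_class == "unvoiced")
    cn_to_active = last_good_was_comfort_noise and first_good_is_active
    if voiced_to_unvoiced or cn_to_active:
        return g_end        # make the beginning gain equal to the end gain
    return g_begin
```
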
US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse responses, response signal, inverse filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients ;
an inverse filter (impulse responses, impulse response) responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses (impulse responses, impulse response) of said linear prediction coefficients to produce an impulse response signal (impulse responses, impulse response) representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse responses, response signal, inverse filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5806024A
CLAIM 8
. A signal encoding device comprising : a spectral parameter quantizer for quantizing spectral parameters of a device input signal into quantized parameters and for converting said quantized parameters into linear prediction coefficients ;
an inverse filter (impulse responses, impulse response) responsive to said linear prediction coefficients for producing an inverse filtered signal ;
a first orthogonal transform circuit responsive to said inverse filtered signal for calculating a first orthogonal transform of said device input signal to produce primary coefficients of said first orthogonal transform ;
a pitch extractor for extracting a pitch frequency from said device input signal ;
a harmonics estimating circuit responsive to said pitch frequency for estimating harmonics locations on said primary coefficients to produce harmonics coefficients at said harmonics locations ;
an impulse response calculating circuit for calculating auditorily weighted impulse responses (impulse responses, impulse response) of said linear prediction coefficients to produce an impulse response signal (impulse responses, impulse response) representative of said auditorily weighted impulse responses ;
a second orthogonal transform circuit responsive to said impulse response signal for calculating a second orthogonal transform of said impulse response signal to produce secondary coefficients of said second orthogonal transform ;
a harmonics quantizer for quantizing said harmonics coefficients jointly as a representative coefficient by using said secondary coefficients into a harmonics code vector representative of a quantized representative coefficient ;
and a residue quantizer for quantizing residue coefficients into residue code vectors and gain code vectors , said residue coefficients being given by removing said quantized representative coefficient from said primary coefficients ;
whereby said device input signal is encoded into a device output signal comprising indexes indicative of said quantized parameters , said harmonics code vector , said residue code vectors , and said gain code vectors .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6173265B1

Filed: 1996-12-23     Issued: 2001-01-09

Voice recording and/or reproducing method and apparatus for reducing a deterioration of a voice signal due to a change over from one coding device to another coding device

(Original Assignee) Olympus Corp     (Current Assignee) Olympus Corp

Hidetaka Takahashi
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (coding device) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US6173265B1
CLAIM 12
. A voice recording device , comprising : a controller ;
at least one coding device (pitch period) in communication with the controller , the at least one coding device being capable of coding an input voice signal in accordance with a selected one of a plurality of bit rates in order to produce coded voice data ;
a bit rate selection device in communication with the controller ;
a voice data memory in communication with the controller ;
and a deterioration reducing device in communication with the controller and responsive to a selection of a switching operation from the selected one of the plurality of bit rates to another selected one of the plurality of bit rates in order to prevent a deterioration of the coded voice data due to the switch over from coding according to the selected one of the plurality of bit rates to coding according to the other selected one of the plurality of bit rates , the deterioration being prevented by the deterioration reducing device by continuing a coding according to the selected one of the plurality of bit rates after the other one of the plurality of bit rates is selected until a predetermined feature is detected by the deterioration reducing device in the input voice signal and without a delay being imposed on the coded voice data by the deterioration reducing device , wherein the deterioration reducing device permits the coding to switch over to the other selected one of the plurality of bit rates when the predetermined feature is detected in the input voice signal .

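US6173265B1 claim 12 defers a requested bit-rate switch until a predetermined feature is detected in the input, so that the switch-over does not degrade the coded voice data. A minimal Python sketch, assuming for illustration that the predetermined feature is a low-energy (pause) frame; the threshold and the placeholder coder call are assumptions, not taken from the reference.

```python
import numpy as np

class RateSwitcher:
    def __init__(self, initial_rate):
        self.rate = initial_rate
        self.pending_rate = None

    def request(self, new_rate):
        self.pending_rate = new_rate          # do not switch immediately

    def encode_frame(self, frame, energy_threshold=1e-4):
        if self.pending_rate is not None and np.mean(frame ** 2) < energy_threshold:
            self.rate = self.pending_rate     # switch only once the feature is seen
            self.pending_rate = None
        return f"coded at {self.rate} bps"    # placeholder for the actual coder call
```
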
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

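Claims 2, 10, 14 and 22 of US7693710B2 transmit the position, shape, sign and amplitude of the first glottal pulse as phase information. A minimal Python sketch, assuming the pulse is the maximum-amplitude sample of the LP residual within the first pitch period and a toy two-entry shape codebook; the patent's actual shape quantization is not reproduced here.

```python
import numpy as np

SHAPES = [np.array([0.2, 1.0, 0.2]),             # toy "shape" codebook
          np.array([-0.3, 1.0, -0.3])]

def encode_first_glottal_pulse(residual, pitch_period):
    seg = residual[:pitch_period]
    pos = int(np.argmax(np.abs(seg)))            # sample of maximum amplitude
    amp = float(np.abs(seg[pos]))
    sign = 1 if seg[pos] >= 0 else -1
    window = seg[max(0, pos - 1):pos + 2]
    window = window / (np.max(np.abs(window)) or 1.0)
    shape_idx = int(np.argmin([np.sum((window * sign - s[:len(window)]) ** 2)
                               for s in SHAPES]))
    return {"position": pos, "sign": sign, "amplitude": amp, "shape": shape_idx}
```
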
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US6173265B1
CLAIM 12
. A voice recording device , comprising : a controller ;
at least one coding device (pitch period) in communication with the controller , the at least one coding device being capable of coding an input voice signal in accordance with a selected one of a plurality of bit rates in order to produce coded voice data ;
a bit rate selection device in communication with the controller ;
a voice data memory in communication with the controller ;
and a deterioration reducing device in communication with the controller and responsive to a selection of a switching operation from the selected one of the plurality of bit rates to another selected one of the plurality of bit rates in order to prevent a deterioration of the coded voice data due to the switch over from coding according to the selected one of the plurality of bit rates to coding according to the other selected one of the plurality of bit rates , the deterioration being prevented by the deterioration reducing device by continuing a coding according to the selected one of the plurality of bit rates after the other one of the plurality of bit rates is selected until a predetermined feature is detected by the deterioration reducing device in the input voice signal and without a delay being imposed on the coded voice data by the deterioration reducing device , wherein the deterioration reducing device permits the coding to switch over to the other selected one of the plurality of bit rates when the predetermined feature is detected in the input voice signal .

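Claims 3, 11, 15 and 23 of US7693710B2 take the maximum-amplitude sample within a pitch period as the first glottal pulse and quantize its position. A minimal Python sketch, assuming a uniform quantizer over the pitch period; the 6-bit resolution is an illustrative assumption, not the codec's actual precision.

```python
import numpy as np

def quantize_pulse_position(residual, pitch_period, bits=6):
    seg = np.abs(residual[:pitch_period])
    pos = int(np.argmax(seg))                        # first glottal pulse position
    levels = 2 ** bits
    step = pitch_period / levels
    index = min(levels - 1, int(pos / step))         # quantization index to transmit
    reconstructed = int(round((index + 0.5) * step)) # decoder-side position estimate
    return index, reconstructed
```
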
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

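Claims 4 and 16 of US7693710B2 compute the energy information parameter from the maximum of the signal energy for frames classified as voiced or onset, and from the average energy per sample for other frames. A minimal Python sketch; any dB conversion or windowing used by the codec is omitted.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Maximum sample energy for voiced/onset frames, average energy per sample otherwise."""
    e = frame.astype(float) ** 2
    if frame_class in ("voiced", "onset"):
        return float(np.max(e))
    return float(np.mean(e))
```
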
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

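Claims 5 and 17 of US7693710B2 first scale the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then let it converge toward the transmitted energy value by the end of the frame while limiting any increase. A minimal Python sketch; the quarter-frame energy windows, linear gain interpolation and cap value are illustrative assumptions.

```python
import numpy as np

def control_energy(synth, e_end_concealed, e_target, max_gain=1.98):
    n = len(synth)
    e_begin = np.sum(synth[:n // 4] ** 2) + 1e-12     # energy near the frame start
    e_end = np.sum(synth[-(n // 4):] ** 2) + 1e-12    # energy near the frame end
    g0 = np.sqrt(e_end_concealed / e_begin)           # match the concealed frame's end
    g1 = min(np.sqrt(e_target / e_end), max_gain)     # converge, limiting the increase
    gains = np.linspace(g0, g1, n)                    # sample-by-sample interpolation
    return gains * synth
```
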
US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

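Claims 8 and 20 of US7693710B2 trigger the excitation-energy adjustment only when the LP filter of the first good frame has a higher gain than that of the last erased frame. A minimal Python sketch that measures that gain as the energy of a truncated impulse response of the synthesis filter 1/A(z); the 64-sample truncation is an assumption. The same quantities can supply E_LP0 and E_LP1 in the relation sketched earlier.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a, n=64):
    """Energy of the (truncated) impulse response of the synthesis filter 1/A(z)."""
    impulse = np.zeros(n); impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))

def needs_energy_adjustment(a_first_good, a_last_erased):
    return (lp_impulse_response_energy(a_first_good) >
            lp_impulse_response_energy(a_last_erased))
```
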
US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US6173265B1
CLAIM 12
. A voice recording device , comprising : a controller ;
at least one coding device (pitch period) in communication with the controller , the at least one coding device being capable of coding an input voice signal in accordance with a selected one of a plurality of bit rates in order to produce coded voice data ;
a bit rate selection device in communication with the controller ;
a voice data memory in communication with the controller ;
and a deterioration reducing device in communication with the controller and responsive to a selection of a switching operation from the selected one of the plurality of bit rates to another selected one of the plurality of bit rates in order to prevent a deterioration of the coded voice data due to the switch over from coding according to the selected one of the plurality of bit rates to coding according to the other selected one of the plurality of bit rates , the deterioration being prevented by the deterioration reducing device by continuing a coding according to the selected one of the plurality of bit rates after the other one of the plurality of bit rates is selected until a predetermined feature is detected by the deterioration reducing device in the input voice signal and without a delay being imposed on the coded voice data by the deterioration reducing device , wherein the deterioration reducing device permits the coding to switch over to the other selected one of the plurality of bit rates when the predetermined feature is detected in the input voice signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (coding device) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US6173265B1
CLAIM 12
. A voice recording device , comprising : a controller ;
at least one coding device (pitch period) in communication with the controller , the at least one coding device being capable of coding an input voice signal in accordance with a selected one of a plurality of bit rates in order to produce coded voice data ;
a bit rate selection device in communication with the controller ;
a voice data memory in communication with the controller ;
and a deterioration reducing device in communication with the controller and responsive to a selection of a switching operation from the selected one of the plurality of bit rates to another selected one of the plurality of bit rates in order to prevent a deterioration of the coded voice data due to the switch over from coding according to the selected one of the plurality of bit rates to coding according to the other selected one of the plurality of bit rates , the deterioration being prevented by the deterioration reducing device by continuing a coding according to the selected one of the plurality of bit rates after the other one of the plurality of bit rates is selected until a predetermined feature is detected by the deterioration reducing device in the input voice signal and without a delay being imposed on the coded voice data by the deterioration reducing device , wherein the deterioration reducing device permits the coding to switch over to the other selected one of the plurality of bit rates when the predetermined feature is detected in the input voice signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US6173265B1
CLAIM 12
. A voice recording device , comprising : a controller ;
at least one coding device (pitch period) in communication with the controller , the at least one coding device being capable of coding an input voice signal in accordance with a selected one of a plurality of bit rates in order to produce coded voice data ;
a bit rate selection device in communication with the controller ;
a voice data memory in communication with the controller ;
and a deterioration reducing device in communication with the controller and responsive to a selection of a switching operation from the selected one of the plurality of bit rates to another selected one of the plurality of bit rates in order to prevent a deterioration of the coded voice data due to the switch over from coding according to the selected one of the plurality of bit rates to coding according to the other selected one of the plurality of bit rates , the deterioration being prevented by the deterioration reducing device by continuing a coding according to the selected one of the plurality of bit rates after the other one of the plurality of bit rates is selected until a predetermined feature is detected by the deterioration reducing device in the input voice signal and without a delay being imposed on the coded voice data by the deterioration reducing device , wherein the deterioration reducing device permits the coding to switch over to the other selected one of the plurality of bit rates when the predetermined feature is detected in the input voice signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

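Claim 20 ties the excitation energy to the gain of the LP synthesis filter when the new filter is "louder" than the one used during concealment. A minimal sketch, assuming the LP gain is measured as the energy of the impulse response of 1/A(z) and that the correction is a simple square-root scaling; both are illustrative choices, not the patent's exact procedure.

```python
import numpy as np

def lp_gain(a, n=64):
    """Energy of the impulse response of the all-pole synthesis filter
    1/A(z), with a = [1, a1, ..., ap]."""
    h = np.zeros(n)
    for k in range(n):
        acc = 1.0 if k == 0 else 0.0
        for j in range(1, len(a)):
            if k >= j:
                acc -= a[j] * h[k - j]
        h[k] = acc
    return float(np.sum(h * h))

def adjust_excitation(excitation, a_last_erased, a_first_good):
    """If the LP gain of the first good frame exceeds that of the last
    concealed frame, scale the excitation down so the synthesized energy
    follows the new, higher filter gain instead of overshooting."""
    g_old, g_new = lp_gain(a_last_erased), lp_gain(a_first_good)
    return excitation * np.sqrt(g_old / g_new) if g_new > g_old else excitation
```
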
US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

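Claim 22 has the encoder transmit not only the position of the first glottal pulse but also its shape, sign and amplitude. The toy encoder below locates the pulse in the LP residual and packs those four quantities; the codebook of pulse shapes and the bit widths are assumptions made for the example.

```python
import numpy as np

def encode_first_glottal_pulse(residual, pitch, shape_codebook):
    """Locate the first glottal pulse (maximum-amplitude sample in the first
    pitch period of the LP residual) and encode its position, sign, quantized
    amplitude and best-matching shape index.  shape_codebook is an assumed
    (K, L) array of normalized pulse shapes with odd length L."""
    seg = residual[:pitch]
    pos = int(np.argmax(np.abs(seg)))                  # pulse position
    sign = 1 if seg[pos] >= 0 else 0                   # 1-bit sign
    amp_q = int(np.clip(round(abs(seg[pos])), 0, 63))  # 6-bit amplitude (assumed)
    L = shape_codebook.shape[1]
    half = L // 2
    window = np.zeros(L)
    lo = max(0, pos - half)
    hi = min(len(seg), pos + (L - half))
    window[half - (pos - lo): half + (hi - pos)] = seg[lo:hi]
    shape_idx = int(np.argmax(shape_codebook @ window))  # closest codebook shape
    return {"position": pos, "sign": sign, "amplitude": amp_q, "shape": shape_idx}
```
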
US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (coding device) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

US6173265B1
CLAIM 12
. A voice recording device , comprising : a controller ;
at least one coding device (pitch period) in communication with the controller , the at least one coding device being capable of coding an input voice signal in accordance with a selected one of a plurality of bit rates in order to produce coded voice data ;
a bit rate selection device in communication with the controller ;
a voice data memory in communication with the controller ;
and a deterioration reducing device in communication with the controller and responsive to a selection of a switching operation from the selected one of the plurality of bit rates to another selected one of the plurality of bit rates in order to prevent a deterioration of the coded voice data due to the switch over from coding according to the selected one of the plurality of bit rates to coding according to the other selected one of the plurality of bit rates , the deterioration being prevented by the deterioration reducing device by continuing a coding according to the selected one of the plurality of bit rates after the other one of the plurality of bit rates is selected until a predetermined feature is detected by the deterioration reducing device in the input voice signal and without a delay being imposed on the coded voice data by the deterioration reducing device , wherein the deterioration reducing device permits the coding to switch over to the other selected one of the plurality of bit rates when the predetermined feature is detected in the input voice signal .

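Claim 23 narrows the phase parameter to the position of the maximum-amplitude sample within a pitch period, plus a quantizer for that position. A small sketch, with an assumed uniform quantizer and bit budget:

```python
import numpy as np

def quantize_first_pulse_position(residual, pitch, bits=6):
    """Take the maximum-amplitude sample within the first pitch period as the
    first glottal pulse and quantize its position uniformly on `bits` bits.
    Returns the transmitted index and the decoder-side reconstruction."""
    pos = int(np.argmax(np.abs(residual[:pitch])))
    step = max(1, -(-pitch // (1 << bits)))   # ceil(pitch / 2**bits)
    index = pos // step
    pos_hat = min(index * step + step // 2, pitch - 1)
    return index, pos_hat
```
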
US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

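Claim 24 computes the energy information parameter differently per class: from a maximum of the signal energy for voiced or onset frames, and from the average energy per sample otherwise. One plausible reading, in a few lines (the dB conversion is added only for readability):

```python
import numpy as np

def energy_information_parameter(frame, frame_class):
    """Maximum of the signal energy for voiced/onset frames, average energy
    per sample for all other classes, reported in dB."""
    if frame_class in ("voiced", "onset"):
        e = float(np.max(frame ** 2))   # maximum of the signal energy
    else:
        e = float(np.mean(frame ** 2))  # average energy per sample
    return 10.0 * np.log10(e + 1e-12)
```
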
US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6173265B1
CLAIM 4
. The device according to claim 3 , wherein at least one of the plurality of voice coding means includes an adaptive codebook (sound signal, speech signal) updated by the quantized data , wherein the control means resets the contents of the adaptive codebook when another voice coding means is selected .

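The relation recited in claim 25 (and in method claim 12 later in this chart), E_q = E_1 · (E_LP0 / E_LP1), rescales the end-of-frame energy by the ratio of the two LP filters' impulse-response energies. A small numerical sketch follows, with hypothetical helper names; only the relation itself comes from the claim.

```python
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """E_LP: energy of the impulse response of the synthesis filter 1/A(z),
    a = [1, a1, ..., ap]."""
    h = np.zeros(n)
    for k in range(n):
        acc = 1.0 if k == 0 else 0.0
        for j in range(1, len(a)):
            if k >= j:
                acc -= a[j] * h[k - j]
        h[k] = acc
    return float(np.sum(h * h))

def target_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1): move the energy target from the synthesized
    signal onto the excitation by compensating for the new LP filter gain."""
    e_lp0 = lp_impulse_response_energy(a_last_good)    # last good frame before the erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)   # first good frame after the erasure
    return e1 * e_lp0 / e_lp1
```
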



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH09321783A

Filed: 1996-12-09     Issued: 1997-12-12

音声符号化伝送システム (Speech coding transmission system)

(Original Assignee) Mitsubishi Electric Corp; 三菱電機株式会社     

Noriaki Kono, Hisashi Naito, Shigeaki Suzuki, Hisashi Yajima, 悠史 内藤, 典明 河野, 久 矢島, 茂明 鈴木
US7693710B2
CLAIM 1
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

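For the artificial periodic excitation of claim 1 (and device claim 13), the decoder rebuilds a lost onset as a low-pass-filtered train of pulses: the first impulse response is centred on the quantized position of the first glottal pulse, and the remaining ones are spaced by the average pitch up to the end of the rebuilt region. The sketch below assumes a short symmetric low-pass impulse response and illustrative sample counts.

```python
import numpy as np

def build_periodic_excitation(n_samples, first_pulse_pos, avg_pitch, lp_filter_ir):
    """Low-pass-filtered periodic pulse train: the first impulse response is
    centred on the quantized first-glottal-pulse position, the rest follow
    every avg_pitch samples until the end of the region being rebuilt."""
    exc = np.zeros(n_samples)
    half = len(lp_filter_ir) // 2
    pos = first_pulse_pos
    while pos < n_samples:
        lo = max(0, pos - half)
        hi = min(n_samples, pos - half + len(lp_filter_ir))
        exc[lo:hi] += lp_filter_ir[lo - (pos - half): hi - (pos - half)]
        pos += avg_pitch            # next pulse one average pitch later
    return exc

# Example: 25 ms of 12.8 kHz excitation, pitch of 50 samples, 7-tap low-pass.
ir = np.hanning(7) / np.hanning(7).sum()
exc = build_periodic_excitation(320, first_pulse_pos=12, avg_pitch=50, lp_filter_ir=ir)
```
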
US7693710B2
CLAIM 2
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 3
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 4
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (の音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 20
【請求項20】 上記受信ノードは、 人工的な雑音である擬似背景雑音を出力する擬似背景雑 音信号発生器と、 受信ノードからの出力源を選択する切換器であって、上 記無音圧縮音声符号の開始から上記遅延時間経過するま では、前記擬似背景雑音信号発生器からの擬似背景雑音 を出力し、その後は上記受信復号器からの音声信号 (speech signal) を出 力する受信出力切換器と、 を有することを特徴とする請求項19記載の音声符号化 伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

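Several of the charted claims (claim 4 and onward) rely on the five-way classification of frames as unvoiced, unvoiced transition, voiced transition, voiced, or onset, driven by parameters of the kind listed in the abstract (normalized correlation, spectral tilt, pitch stability, relative frame energy, zero-crossing rate). The decision rule sketched below, including its weights and thresholds, is invented for illustration only; the patent does not disclose these particular numbers.

```python
def classify_frame(prev_class, norm_corr, tilt_db, pitch_stability,
                   rel_energy_db, zero_cross_rate):
    """Toy five-way classifier: a crude voicing merit is built from the
    parameter families named in the patent, then thresholded, with the
    previous frame's class deciding between plain and transition labels."""
    merit = (2.0 * norm_corr + 0.02 * tilt_db - 0.5 * pitch_stability
             + 0.02 * rel_energy_db - 1.0 * zero_cross_rate)
    voiced_like = ("voiced", "onset", "voiced transition")
    if merit > 1.2:
        return "voiced" if prev_class in voiced_like else "onset"
    if merit > 0.6:
        return "voiced transition" if prev_class in voiced_like else "unvoiced transition"
    return "unvoiced transition" if prev_class in voiced_like else "unvoiced"
```
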
US7693710B2
CLAIM 5
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (の音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (検知器) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH09321783A
CLAIM 20
【請求項20】 上記受信ノードは、 人工的な雑音である擬似背景雑音を出力する擬似背景雑 音信号発生器と、 受信ノードからの出力源を選択する切換器であって、上 記無音圧縮音声符号の開始から上記遅延時間経過するま では、前記擬似背景雑音信号発生器からの擬似背景雑音 を出力し、その後は上記受信復号器からの音声信号 (speech signal) を出 力する受信出力切換器と、 を有することを特徴とする請求項19記載の音声符号化 伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (の音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09321783A
CLAIM 20
【請求項20】 上記受信ノードは、 人工的な雑音である擬似背景雑音を出力する擬似背景雑 音信号発生器と、 受信ノードからの出力源を選択する切換器であって、上 記無音圧縮音声符号の開始から上記遅延時間経過するま では、前記擬似背景雑音信号発生器からの擬似背景雑音 を出力し、その後は上記受信復号器からの音声信号 (speech signal) を出 力する受信出力切換器と、 を有することを特徴とする請求項19記載の音声符号化 伝送システム。

US7693710B2
CLAIM 8
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 10
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

US7693710B2
CLAIM 11
. A method of concealing frame erasure (ない時) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 23
【請求項23】 音声信号を高能率符号化し、得られた 原音声符号を分割してセルを構成し、このセルを非同期 転送モード伝送路に出力する送信ノードと、前記非同期 転送モード伝送路から受信した前記セルを分解して原音 声符号を取り出し、この原音声符号を同期をとって同期 転送モード伝送路に出力する中継ノードと、前記同期転 送モード伝送路から受信した音声符号を復号処理して音 声信号を出力する受信ノードとを含んだ音声符号化伝送 システムにおいて、 前記中継ノードは、 受信されたセルから前記非同期転送モード伝送路でのセ ル消失を検知し、これに基づき当該中継ノードの動作を 制御する中継制御信号を出力する中継制御手段と、 前記セル消失により欠落した前記原音声符号を、受信し た前記原音声符号に基づいて補って中継音声符号を生成 する音声符号修復部と、 前記中継制御信号に基づいて、前記同期転送モード伝送 路に前記原音声符号と前記中継音声符号とのいずれを出 力するかを切り替える切替器であって、前記セル消失の 検知時には前記中継音声符号を出力し、前記セル消失を 検知しない時 (concealing frame erasure) には、前記原音声符号を出力する出力切替 器と、 を有することを特徴とする音声符号化伝送システム。

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (検知器) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (検知器) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (検知器) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (検知器) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (検知器) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 20
【請求項20】 上記受信ノードは、 人工的な雑音である擬似背景雑音を出力する擬似背景雑 音信号発生器と、 受信ノードからの出力源を選択する切換器であって、上 記無音圧縮音声符号の開始から上記遅延時間経過するま では、前記擬似背景雑音信号発生器からの擬似背景雑音 を出力し、その後は上記受信復号器からの音声信号 (speech signal) を出 力する受信出力切換器と、 を有することを特徴とする請求項19記載の音声符号化 伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (検知器) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (検知器) , limits to a given value a gain used for scaling the synthesized sound signal .
JPH09321783A
CLAIM 20
【請求項20】 上記受信ノードは、 人工的な雑音である擬似背景雑音を出力する擬似背景雑 音信号発生器と、 受信ノードからの出力源を選択する切換器であって、上 記無音圧縮音声符号の開始から上記遅延時間経過するま では、前記擬似背景雑音信号発生器からの擬似背景雑音 を出力し、その後は上記受信復号器からの音声信号 (speech signal) を出 力する受信出力切換器と、 を有することを特徴とする請求項19記載の音声符号化 伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (の音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09321783A
CLAIM 20
【請求項20】 上記受信ノードは、 人工的な雑音である擬似背景雑音を出力する擬似背景雑 音信号発生器と、 受信ノードからの出力源を選択する切換器であって、上 記無音圧縮音声符号の開始から上記遅延時間経過するま では、前記擬似背景雑音信号発生器からの擬似背景雑音 を出力し、その後は上記受信復号器からの音声信号 (speech signal) を出 力する受信出力切換器と、 を有することを特徴とする請求項19記載の音声符号化 伝送システム。

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (検知器) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

JPH09321783A
CLAIM 26
【請求項26】 上記中継ノードは、 上記中継音声符号を復号する検査復号器と、 前記検査復号器の出力音声信号に含まれる異音成分を検 知する異音検知器 (decoder recovery) と、 前記異音成分の検知時には、入力された上記中継音声符 号を修正して出力する音声符号修正器と、を有し、 上記出力切替器は、上記原音声符号と、前記音声符号修 正器から出力される中継音声符号とを切り替えて出力す ること、 を特徴とする請求項25記載の音声符号化伝送システ ム。

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09321783A
CLAIM 1
【請求項1】 音声信号を差分符号化し音声符号である 原音声符号を第1の伝送路に出力する送信ノードと、前 記第1の伝送路から受信した原音声符号に基づいて音声 信号の有音期間に対応する音声符号のみを選択して第2 の伝送路に出力することにより無音圧縮を行う中継ノー ドと、前記第2の伝送路から受信した無音圧縮音声符号 を復号処理して音声信号を出力する受信ノードとを含ん だ音声符号化伝送システムにおいて、 前記中継ノードは、 前記原音声符号から音声信号に含まれる音声情報を取り 出す中継復号器と、 この音声情報に基づいて前記音声信号の有音期間・無音 期間を判別し、これに基づき中継ノードの動作を制御す る中継制御信号を出力する中継制御手段と、 前記中継制御信号に基づいて、前記無音期間から前記有 音期間に遷移するタイミングである音声開始時における 前記差分符号化の基準値を決定する符号化基準値決定 (signal classification parameter) 手 段と、 前記音声開始時において、この記基準値に基づいて前記 音声情報の前記差分符号化を開始し、少なくとも一定の 過渡期間、中継音声符号を生成する中継符号化器と、 前記原音声符号と前記中継音声符号とが入力され、前記 第2の伝送路に、前記中継制御信号に基づいて、前記過 渡期間内では前記中継音声符号を出力し、前記過渡期間 以降の有音期間では前記原音声符号を出力して、前記無 音圧縮音声符号を合成する無音圧縮手段と、を有し、 前記受信ノードは、 前記無音圧縮音声符号に基づいて前記音声開始を判別 し、これに基づき受信ノードの動作を制御する受信制御 信号を出力する受信制御手段と、 前記受信制御信号に基づいて、差分符号化の前記基準値 に対応した前記復号処理の基準値を前記音声開始時にお いて決定する復号基準値決定手段と、 前記音声開始時において、この復号処理の基準値に基づ いて前記無音圧縮音声符号の前記復号処理を開始し、前 記音声信号を出力する受信復号器と、 を有することを特徴とする音声符号化伝送システム。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定: value determination) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (の音声信号: speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09321783A
CLAIM 1
[Claim 1] In a speech coding transmission system including a transmitting node that differentially encodes a speech signal and outputs an original speech code , which is a speech code , to a first transmission path , a relay node that performs silence compression by selecting , based on the original speech code received from the first transmission path , only the speech code corresponding to active-speech periods of the speech signal and outputting it to a second transmission path , and a receiving node that decodes the silence-compressed speech code received from the second transmission path and outputs a speech signal , said relay node comprises : a relay decoder that extracts , from the original speech code , speech information contained in the speech signal ; relay control means that discriminates between active-speech periods and silent periods of the speech signal on the basis of this speech information and outputs a relay control signal controlling operation of the relay node accordingly ; coding reference value determination (signal classification parameter) means that determines , on the basis of the relay control signal , a reference value for the differential coding at speech onset , that is , at the timing of the transition from the silent period to the active-speech period ; a relay encoder that , at the speech onset , starts the differential coding of the speech information on the basis of this reference value and generates a relay speech code for at least a fixed transition period ; and silence compression means that receives the original speech code and the relay speech code and , on the basis of the relay control signal , outputs to the second transmission path the relay speech code within the transition period and the original speech code in the active-speech period after the transition period , thereby synthesizing the silence-compressed speech code ; and said receiving node comprises : reception control means that determines the speech onset on the basis of the silence-compressed speech code and outputs a reception control signal controlling operation of the receiving node accordingly ; decoding reference value determination means that determines , at the speech onset and on the basis of the reception control signal , a reference value for the decoding processing corresponding to the reference value of the differential coding ; and a receiving decoder that , at the speech onset , starts the decoding processing of the silence-compressed speech code on the basis of this decoding reference value and outputs the speech signal .

JPH09321783A
CLAIM 20
[Claim 20] The speech coding transmission system according to claim 19 , wherein said receiving node comprises : a pseudo background noise signal generator that outputs pseudo background noise , which is artificial noise ; and a reception output switch that selects the output source of the receiving node , outputting the pseudo background noise from said pseudo background noise signal generator until said delay time has elapsed from the start of said silence-compressed speech code , and thereafter outputting the speech signal (speech signal) from said receiving decoder .
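
For orientation, a minimal sketch of the energy-information computation recited in claim 24: a maximum of the signal energy for frames classified as voiced or onset, and an average energy per sample for other frames. The dB conversion and the function name are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch: energy information parameter, per the classification of the frame.
import numpy as np

def energy_parameter(frame, frame_class):
    x = frame.astype(float)
    if frame_class in ("voiced", "onset"):
        e = float(np.max(x ** 2))       # maximum of the signal energy
    else:
        e = float(np.mean(x ** 2))      # average energy per sample
    return 10.0 * np.log10(e + 1e-12)   # reported in dB (illustrative choice)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    frame = rng.standard_normal(256)
    print(round(energy_parameter(frame, "voiced"), 2),
          round(energy_parameter(frame, "unvoiced"), 2))
```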

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (値決定: value determination) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (検知器: detector) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1·(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH09321783A
CLAIM 1
[Claim 1] In a speech coding transmission system including a transmitting node that differentially encodes a speech signal and outputs an original speech code , which is a speech code , to a first transmission path , a relay node that performs silence compression by selecting , based on the original speech code received from the first transmission path , only the speech code corresponding to active-speech periods of the speech signal and outputting it to a second transmission path , and a receiving node that decodes the silence-compressed speech code received from the second transmission path and outputs a speech signal , said relay node comprises : a relay decoder that extracts , from the original speech code , speech information contained in the speech signal ; relay control means that discriminates between active-speech periods and silent periods of the speech signal on the basis of this speech information and outputs a relay control signal controlling operation of the relay node accordingly ; coding reference value determination (signal classification parameter) means that determines , on the basis of the relay control signal , a reference value for the differential coding at speech onset , that is , at the timing of the transition from the silent period to the active-speech period ; a relay encoder that , at the speech onset , starts the differential coding of the speech information on the basis of this reference value and generates a relay speech code for at least a fixed transition period ; and silence compression means that receives the original speech code and the relay speech code and , on the basis of the relay control signal , outputs to the second transmission path the relay speech code within the transition period and the original speech code in the active-speech period after the transition period , thereby synthesizing the silence-compressed speech code ; and said receiving node comprises : reception control means that determines the speech onset on the basis of the silence-compressed speech code and outputs a reception control signal controlling operation of the receiving node accordingly ; decoding reference value determination means that determines , at the speech onset and on the basis of the reception control signal , a reference value for the decoding processing corresponding to the reference value of the differential coding ; and a receiving decoder that , at the speech onset , starts the decoding processing of the silence-compressed speech code on the basis of this decoding reference value and outputs the speech signal .

JPH09321783A
CLAIM 26
[Claim 26] The speech coding transmission system according to claim 25 , wherein said relay node comprises : a check decoder that decodes said relay speech code ; an abnormal-sound detector (decoder recovery) that detects an abnormal-sound component contained in the output speech signal of the check decoder ; and a speech code corrector that , when the abnormal-sound component is detected , corrects the input relay speech code and outputs it ; and said output switch switches between said original speech code and the relay speech code output from said speech code corrector for output .
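
For context, a minimal sketch of the energy adjustment recited in claim 25: when the LP filter of the first good frame after an erasure has a higher gain than the filter used for the last concealed frame, the excitation energy target is reduced by the ratio of the impulse-response energies, E_q = E_1·(E_LP0/E_LP1). Comparing impulse-response energies as a proxy for filter gain, the filter coefficients, and the response length are illustrative assumptions.

```python
# Minimal sketch: excitation-energy adjustment by the ratio of LP impulse-response energies.
import numpy as np

def impulse_response_energy(lp_coeffs, length=64):
    """Energy of the impulse response of the all-pole filter 1/A(z), A(z) monic."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, len(lp_coeffs)):
            if n - k >= 0:
                acc -= lp_coeffs[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    e_lp0 = impulse_response_energy(a_last_good)    # last good frame before the erasure
    e_lp1 = impulse_response_energy(a_first_good)   # first good frame after the erasure
    return e1 * e_lp0 / e_lp1 if e_lp1 > e_lp0 else e1   # adjust only if new filter gain is higher

if __name__ == "__main__":
    a0 = [1.0, -0.5]    # weaker resonance (illustrative coefficients)
    a1 = [1.0, -0.9]    # stronger resonance, i.e. higher LP gain
    print(adjusted_excitation_energy(1.0, a0, a1))
```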




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5819212A

Filed: 1996-10-24     Issued: 1998-10-06

Voice encoding method and apparatus using modified discrete cosine transform

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Jun Matsumoto, Shiro Omori, Masayuki Nishiguchi, Kazuyuki Iijima
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (discrete cosine, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .
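 
For context on the classification step recited in claim 4, the following rough sketch classifies frames as unvoiced, unvoiced transition, voiced transition, voiced, or onset. The features follow parameters named in the patent abstract (normalized correlation, zero crossings, relative energy), but the thresholds and the simple decision ladder are illustrative assumptions.

```python
# Rough sketch of a frame classifier; thresholds and decision ladder are assumptions.
import numpy as np

def classify_frame(frame, prev_class, pitch_lag):
    x = frame.astype(float)
    a, b = x[pitch_lag:], x[:-pitch_lag]
    corr = float(np.dot(a, b) / (np.sqrt(np.dot(a, a) * np.dot(b, b)) + 1e-12))  # normalized pitch correlation
    zcr = float(np.mean(np.abs(np.diff(np.sign(x))) > 0))                        # zero-crossing rate
    energy_db = 10 * np.log10(np.mean(x ** 2) + 1e-12)                           # frame energy in dB

    if energy_db < -40:
        return "unvoiced"
    if corr > 0.7 and zcr < 0.2:
        return "onset" if prev_class in ("unvoiced", "unvoiced transition") else "voiced"
    if corr > 0.4:
        return "voiced transition" if prev_class in ("voiced", "onset") else "unvoiced transition"
    return "unvoiced transition" if prev_class in ("voiced", "onset", "voiced transition") else "unvoiced"

if __name__ == "__main__":
    t = np.arange(256)
    voiced_like = np.sin(2 * np.pi * t / 64)                      # strongly periodic
    noise_like = np.random.default_rng(2).standard_normal(256)    # noise-like
    print(classify_frame(voiced_like, "unvoiced", 64))             # expect "onset"
    print(classify_frame(noise_like, "unvoiced", 64))              # expect "unvoiced"
```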

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (said transmission) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .
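 
For context, a simplified sketch of the energy-control step recited in claim 5: the synthesized signal of the first good frame is scaled so that its energy at the frame beginning matches the energy at the end of the last concealed frame, and the gain is then interpolated toward the value implied by the received energy parameter, with the upward gain change capped. The cap value, the quarter-frame energy windows and the linear ramp are assumptions for illustration.

```python
# Simplified sketch: scale-at-start, converge-toward-target energy control with a gain cap.
import numpy as np

def scale_first_good_frame(synth, e_end_of_concealed, e_target, max_gain=1.5):
    n = len(synth)
    e_begin = np.mean(synth[: n // 4] ** 2) + 1e-12        # energy near the frame start
    e_frame_end = np.mean(synth[-(n // 4):] ** 2) + 1e-12  # energy near the frame end
    g0 = min(max_gain, np.sqrt(e_end_of_concealed / e_begin))   # match energy of last concealed frame
    g1 = min(max_gain, np.sqrt(e_target / e_frame_end))         # converge to the received energy parameter
    gains = np.linspace(g0, g1, n)                               # sample-by-sample gain ramp
    return synth * gains

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    frame = rng.standard_normal(256)
    out = scale_first_good_frame(frame, e_end_of_concealed=0.25, e_target=1.0)
    print(round(float(np.mean(out[:64] ** 2)), 3), round(float(np.mean(out[-64:] ** 2)), 3))
```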

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (discrete cosine, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (said transmission) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (discrete cosine, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (said transmission) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (said transmission) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 9
. The multiplexing method as claimed in claim 8 , wherein said first portion is data obtained by a linear predictive analysis of said input signal followed by quantization of parameters representing linear prediction coefficients (comfort noise) .
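
For context, a small sketch of the gain-decision rules recited in claims 6 and 7: in a voiced-to-unvoiced transition, or when coming out of a comfort-noise (inactive speech) period into active speech, the scaling gain used at the start of the first good frame is simply set equal to the gain used at its end; when that frame is classified as onset, the gain is clamped to a fixed ceiling. The ceiling value and function name are assumptions.

```python
# Small sketch of the transition-dependent gain selection; cap value is an assumption.
def select_start_gain(g_start, g_end, last_good_class, first_good_class,
                      last_good_was_comfort_noise, onset_gain_cap=1.2):
    if first_good_class == "onset":
        return min(g_start, onset_gain_cap)        # limit the scaling gain for onsets
    voiced_like = ("voiced transition", "voiced", "onset")
    if last_good_class in voiced_like and first_good_class == "unvoiced":
        return g_end                               # voiced -> unvoiced transition
    if last_good_was_comfort_noise and first_good_class != "comfort noise":
        return g_end                               # inactive -> active speech transition
    return g_start

if __name__ == "__main__":
    print(select_start_gain(2.0, 0.8, "voiced", "unvoiced", False))   # -> 0.8
    print(select_start_gain(2.0, 0.8, "unvoiced", "onset", False))    # -> 1.2
```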

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (said transmission) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (said transmission) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1·(E_LP0/E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (said transmission) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .
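 
For context on the US5819212A reference claims charted in this ground, the following is a compact sketch of the low-band chain they recite: a short-term (LPC) prediction to obtain residuals, a long-term (pitch) prediction on those residuals, and an MDCT whose transform length is a power of 2. The LPC order, lag search range, direct matrix MDCT and all function names are illustrative assumptions, not taken from the reference.

```python
# Compact sketch: short-term residual -> long-term residual -> power-of-2 MDCT.
import numpy as np

def lpc(x, order=10):
    """Solve the autocorrelation normal equations for A(z) = 1 + a1 z^-1 + ..."""
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R + 1e-9 * np.eye(order), -r[1:order + 1])
    return np.concatenate(([1.0], a))

def short_term_residual(x, a):
    return np.convolve(x, a)[: len(x)]                     # filter by A(z)

def long_term_residual(res, min_lag=32, max_lag=128):
    lag = max(range(min_lag, max_lag), key=lambda L: np.dot(res[L:], res[:-L]))
    gain = np.dot(res[lag:], res[:-lag]) / (np.dot(res[:-lag], res[:-lag]) + 1e-12)
    out = res.copy()
    out[lag:] -= gain * res[:-lag]                          # remove the pitch prediction
    return out, lag, gain

def mdct(x):
    n = len(x)
    assert n & (n - 1) == 0, "transform length must be a power of 2"
    k = np.arange(n // 2)
    m = np.arange(n)
    basis = np.cos(np.pi / (n // 2) * (m[None, :] + 0.5 + n / 4) * (k[:, None] + 0.5))
    return basis @ x

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    t = np.arange(512)
    speech_like = np.sin(2 * np.pi * t / 64) + 0.1 * rng.standard_normal(512)
    a = lpc(speech_like)
    st = short_term_residual(speech_like, a)
    lt, lag, g = long_term_residual(st)
    print(lag, round(float(g), 2), mdct(lt).shape)
```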

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (said transmission) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1·(E_LP0/E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (said transmission) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .
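 
For context, a sketch of the artificial periodic-excitation construction recited in claim 13: a low-pass filter impulse response is centred on the quantized position of the first glottal pulse, and further copies are placed one (rounded) average pitch period apart until the end of the region being reconstructed. The windowed-sinc low-pass, its cutoff and length are illustrative assumptions.

```python
# Sketch: low-pass filtered periodic pulse train anchored at the quantized first-pulse position.
import numpy as np

def lowpass_impulse_response(length=17, cutoff=0.25):
    n = np.arange(length) - (length - 1) / 2
    h = np.sinc(2 * cutoff * n) * np.hamming(length)       # windowed-sinc low-pass (assumption)
    return h / np.sum(h)

def build_periodic_excitation(num_samples, first_pulse_pos, avg_pitch, amplitude=1.0):
    h = lowpass_impulse_response()
    half = len(h) // 2
    exc = np.zeros(num_samples)
    pos = first_pulse_pos
    while pos < num_samples:                                # one response per pitch period
        lo, hi = max(0, pos - half), min(num_samples, pos + half + 1)
        exc[lo:hi] += amplitude * h[lo - (pos - half): hi - (pos - half)]
        pos += int(round(avg_pitch))                        # spacing = rounded average pitch
    return exc

if __name__ == "__main__":
    exc = build_periodic_excitation(num_samples=256, first_pulse_pos=23, avg_pitch=61.4)
    peaks = np.flatnonzero(exc > 0.9 * exc.max())
    print(peaks[:4])                                        # pulse centres, one pitch period apart
```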

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (discrete cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (said transmission) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (discrete cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (said transmission) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (discrete cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (said transmission) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (said transmission) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 9
. The multiplexing method as claimed in claim 8 , wherein said first portion is data obtained by a linear predictive analysis of said input signal followed by quantization of parameters representing linear prediction coefficients (comfort noise) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (said transmission) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non (said transmission) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (said transmission) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .
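The energy-adjustment relation recited in claims 9, 12, 21 and 25 of US7693710B2, reconstructed above as E_q = E_1 (E_LP0 / E_LP1), can be illustrated with a short numerical sketch. The Python fragment below computes the impulse-response energies of the two LP filters and rescales the excitation of the first non-erased frame to the target energy E_q. It is a minimal sketch only; the function names, the 64-sample impulse-response length and the example filter coefficients are illustrative assumptions, not values taken from the patent or from US5819212A.

    import numpy as np

    def lp_impulse_energy(a, n_samples=64):
        # Energy of the impulse response of the all-pole LP synthesis filter 1/A(z),
        # where a = [1, a1, ..., ap] holds the coefficients of A(z).
        h = np.zeros(n_samples)
        for n in range(n_samples):
            acc = 1.0 if n == 0 else 0.0
            for k in range(1, len(a)):
                if n - k >= 0:
                    acc -= a[k] * h[n - k]
            h[n] = acc
        return float(np.sum(h ** 2))

    def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
        # Target energy E_q = E_1 * (E_LP0 / E_LP1); the excitation of the first
        # non-erased frame is then rescaled so that its energy equals E_q.
        e_lp0 = lp_impulse_energy(a_last_good)    # LP filter of last good frame before erasure
        e_lp1 = lp_impulse_energy(a_first_good)   # LP filter of first good frame after erasure
        e_q = e1 * e_lp0 / e_lp1
        x = np.asarray(excitation, dtype=float)
        current = float(np.sum(x ** 2))
        gain = (e_q / current) ** 0.5 if current > 0.0 else 1.0
        return gain * x

    # Example with hypothetical 2nd-order LP filters and a random excitation
    exc = adjust_excitation_energy(np.random.randn(256), e1=100.0,
                                   a_last_good=[1.0, -0.9, 0.4],
                                   a_first_good=[1.0, -1.2, 0.5])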

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .
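Claims 2, 3, 10, 11, 14, 15, 22 and 23 of US7693710B2 tie the phase information parameter to the position of the first glottal pulse, taken as the sample of maximum amplitude within a pitch period and then quantized. A minimal sketch of that search-and-quantize step follows, assuming the pulse is located in the LP residual and quantized on a uniform 6-bit grid; the residual, the grid resolution and the function names are assumptions for illustration, not details drawn from the patent or from US5819212A.

    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period):
        # Maximum-amplitude sample within the first pitch period of the frame,
        # taken here as the first glottal pulse.
        segment = np.abs(np.asarray(residual[:pitch_period], dtype=float))
        return int(np.argmax(segment))

    def quantize_pulse_position(position, pitch_period, n_bits=6):
        # Uniform quantization of the pulse position inside the pitch period
        # (the 6-bit grid is an assumption of this sketch).
        levels = 2 ** n_bits
        step = pitch_period / levels
        index = min(int(position / step), levels - 1)
        reconstructed = int(round((index + 0.5) * step))
        return index, reconstructed

    # Example: synthetic residual with a single dominant pulse at sample 23
    residual = np.zeros(80)
    residual[23] = 1.0
    pos = first_glottal_pulse_position(residual, pitch_period=60)
    idx, pos_hat = quantize_pulse_position(pos, pitch_period=60)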

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link (discrete cosine, speech signal) for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (discrete cosine, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5819212A
CLAIM 1
. A signal encoding method comprising the steps of : splitting an input signal into a plurality of frequency bands ;
encoding signals of said each of the plurality of frequency bands in respective manners depending on signal characteristics of said each of the plurality of frequency bands ;
splitting the input speech signal (communication link, decoder determines concealment, speech signal) into a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
performing a short-term prediction on the signals of the second frequency band for finding short-term prediction residuals ;
performing a long-term prediction on the short-term prediction residuals for finding long-term prediction residuals ;
and orthogonal-transforming the long-term prediction residuals using a modified discrete cosine (communication link, decoder determines concealment, speech signal) transform for the orthogonal transform step with a predetermined transform length selected to be a power of 2 .
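Claims 4, 16 and 24 of US7693710B2 compute the energy information parameter from a maximum of the signal energy for frames classified as voiced or onset and from the average energy per sample for other frames. The sketch below shows one plausible two-branch computation, assuming a sliding pitch-period window for the maximum-energy branch and a dB output; the window choice and the floor value are assumptions of this example.

    import numpy as np

    def energy_information(frame, frame_class, pitch_period=None):
        # Energy information parameter, in dB.  Voiced/onset frames: maximum of
        # the signal energy over a sliding window of one pitch period.
        # Other frames: average energy per sample over the whole frame.
        s = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset") and pitch_period:
            w = min(int(pitch_period), len(s))
            e = max(float(np.sum(s[i:i + w] ** 2))
                    for i in range(len(s) - w + 1))
        else:
            e = float(np.mean(s ** 2))
        return 10.0 * np.log10(max(e, 1e-12))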

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (said transmission) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (said transmission) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5819212A
CLAIM 7
. A portable radio terminal apparatus including an antenna , the apparatus comprising : first amplifier means for amplifying an input speech signal to provide a first amplified signal ;
A/D conversion means for A/D converting the first amplified signal ;
speech encoding means for encoding an output of said A/D conversion means to provide an encoded signal ;
transmission path encoding means for channel-coding said encoded signal ;
modulation means for modulating an output of said transmission (first non, last non) path encoding means to provide a modulated signal ;
D/A conversion means for D/A converting said modulated signal ;
and second amplifier means for amplifying a signal from said D/A conversion means to provide a second amplified signal and for supplying the second amplified signal to the antenna ;
wherein said speech encoding means includes : band-splitting means for splitting the output of said A/D conversion means into a plurality of frequency bands , wherein the plurality of frequency bands include a first frequency band and a second frequency band , said second frequency band being lower on the frequency spectrum than the first frequency band ;
and encoding means for encoding signals of each of said plurality of frequency bands in respective manners responsive to signal characteristics of said each of the plurality of frequency bands and for multiplexing a first signal of one of the plurality of frequency bands and a portion of a second signal of another of the plurality of frequency bands that is not in common with said first signal ;
means for finding short-term prediction residuals by a short-term prediction performed on a signal of a lowest one of said plurality of frequency bands ;
means for finding long-term prediction residuals by performing a long-term prediction on the short-term prediction residuals ;
and orthogonal transform means for orthogonal-transforming the long-term prediction residuals using a modified discrete cosine transform for the orthogonal transform with a predetermined transform length selected to be a power of 2 .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5890108A

Filed: 1996-10-03     Issued: 1999-03-30

Low bit-rate speech coding system and method using voicing probability determination

(Original Assignee) Voxware Inc     (Current Assignee) Voxware Inc

Suat Yeldener
US7693710B2
CLAIM 1
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (weighting function, frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (weighting function, frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US5890108A
CLAIM 24
. The method of claim 23 wherein the frequency domain filtering is applied in accordance with the expression ##EQU28## where R_ω(ω)=H(ω)W(ω) in which W(ω) is the weighting function (first impulse, first impulse response, impulse responses) , represented as ##EQU29## the coefficient γ is between 0 and 1 , and the frequency response (first impulse, first impulse response, impulse responses) H(ω) of the LPC filter is given by : ##EQU30## where a_ρ is the coefficient of a ρth order all-pole LPC filter , γ is the weighting coefficient , and R_max is the maximum value of the weighted spectral envelope .
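Claim 1 of US7693710B2 (and its device counterpart, claim 13) rebuilds the periodic excitation of a lost onset frame as a low-pass filtered pulse train: the first impulse response of a low-pass filter is centred on the quantized first-glottal-pulse position and further impulse responses follow at the average pitch distance. The Python sketch below shows one way such a construction could look, assuming a simple windowed-sinc low-pass prototype, a 17-tap length and an illustrative frame length; none of these values come from the patent or from US5890108A.

    import numpy as np

    def lowpass_impulse_response(cutoff=0.25, length=17):
        # Windowed-sinc FIR low-pass prototype; `cutoff` is the normalized
        # cutoff frequency in cycles/sample (assumed for this sketch).
        n = np.arange(length) - (length - 1) / 2
        h = 2.0 * cutoff * np.sinc(2.0 * cutoff * n) * np.hamming(length)
        return h / np.sum(h)

    def artificial_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, gain=1.0):
        # Low-pass filtered pulse train: the first impulse response is centred on
        # the quantized first-glottal-pulse position, the following ones are placed
        # every `avg_pitch` samples up to the end of the frame.
        h = lowpass_impulse_response()
        half = len(h) // 2
        exc = np.zeros(frame_len)
        pos = int(first_pulse_pos)
        while pos < frame_len:
            for k, hk in enumerate(h):
                idx = pos + k - half
                if 0 <= idx < frame_len:
                    exc[idx] += gain * hk
            pos += int(avg_pitch)
        return exc

    # Example: 256-sample frame, first pulse at sample 30, average pitch of 60 samples
    exc = artificial_periodic_excitation(256, first_pulse_pos=30, avg_pitch=60)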

US7693710B2
CLAIM 2
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .
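Claims 5 and 17 of US7693710B2 control the energy of the synthesized signal in the first non-erased frame: the frame is first scaled so its starting energy matches the energy at the end of the last erased frame, then converged toward the received energy information by the end of the frame while limiting any energy increase. A minimal sketch of such a sample-wise gain interpolation follows; the quarter-frame energy windows, the linear interpolation law and the cap on the ending gain are assumptions of this example, not the codec's actual rule.

    import numpy as np

    def scale_first_good_frame(synth, e_erased_end, e_received, max_increase=2.0):
        # g0 matches the energy at the start of the frame to the energy at the
        # end of the last erased frame; g1 targets the received energy
        # information but is capped so that the energy increase stays limited.
        s = np.asarray(synth, dtype=float)
        quarter = max(len(s) // 4, 1)
        e_start = float(np.mean(s[:quarter] ** 2)) + 1e-12
        e_end = float(np.mean(s[-quarter:] ** 2)) + 1e-12
        g0 = (e_erased_end / e_start) ** 0.5
        g1 = min((e_received / e_end) ** 0.5, g0 * max_increase)
        gains = np.linspace(g0, g1, num=len(s))   # sample-wise gain interpolation
        return gains * s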

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (PC mode) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .
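Claims 7 and 19 of US7693710B2 force the scaling gain at the beginning of the first non-erased frame to equal the gain at its end in two situations: a voiced-to-unvoiced transition, and a comfort-noise-to-active-speech transition. The small decision helper below restates that condition; the class and coding-mode labels are assumed string values chosen for this sketch.

    def equalize_scaling_gain(last_class, first_class, last_coding, first_coding):
        # True when the gain used at the beginning of the first good frame should
        # simply be set equal to the gain used at its end.
        voiced_to_unvoiced = (last_class in ("voiced transition", "voiced", "onset")
                              and first_class == "unvoiced")
        comfort_to_active = (last_coding == "comfort_noise"
                             and first_coding == "active_speech")
        return voiced_to_unvoiced or comfort_to_active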

US7693710B2
CLAIM 8
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (PC mode) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (weighting function, frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (weighting function, frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US5890108A
CLAIM 24
. The method of claim 23 wherein the frequency domain filtering is applied in accordance with the expression ##EQU28## where R_ω(ω)=H(ω)W(ω) in which W(ω) is the weighting function (first impulse, first impulse response, impulse responses) , represented as ##EQU29## the coefficient γ is between 0 and 1 , and the frequency response (first impulse, first impulse response, impulse responses) H(ω) of the LPC filter is given by : ##EQU30## where a_ρ is the coefficient of a ρth order all-pole LPC filter , γ is the weighting coefficient , and R_max is the maximum value of the weighted spectral envelope .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (PC mode) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (PC mode) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (PC mode) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (PC mode) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5890108A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and detecting the presence of a fundamental frequency F0 comprises computing the spectrum of the signal in a segment .

US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (PC mode) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5890108A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5890108A
CLAIM 22
. The method of claim 21 wherein the model of the signal is an LPC model (frame erasure) , the extracted data further comprises a gain parameter , and the amplitudes of said harmonics are determined using the gain parameter by sampling the LPC spectrum model at harmonics of the fundamental frequency .
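
To make the quoted energy-adjustment relation concrete (claims 9, 12, 21 and 25 all recite E_q = E_1 · (E_LP0 / E_LP1)), the short Python sketch below applies the corresponding scaling gain to a decoder excitation buffer. The function name, the example values and the use of E_q as a target excitation energy are assumptions made only for illustration.

```python
import numpy as np

def adjust_excitation_energy(excitation: np.ndarray,
                             e1: float, e_lp0: float, e_lp1: float) -> np.ndarray:
    """Hedged sketch around the quoted relation E_q = E_1 * (E_LP0 / E_LP1).

    e1    : energy at the end of the current frame
    e_lp0 : energy of the impulse response of the LP filter of the last
            good frame received before the erasure
    e_lp1 : energy of the impulse response of the LP filter of the first
            good frame received after the erasure
    Treating E_q as a target energy for the excitation of the first good
    frame is an interpretation made for illustration only.
    """
    e_q = e1 * (e_lp0 / e_lp1)                                    # quoted relation
    current = float(np.sum(excitation.astype(np.float64) ** 2)) + 1e-12
    gain = np.sqrt(e_q / current)                                 # scalar scaling gain
    return gain * excitation

# Hypothetical example: the LP gain grew after the erasure (e_lp1 > e_lp0),
# so the excitation is attenuated to avoid an energy burst.
exc = np.random.default_rng(0).standard_normal(256)
scaled = adjust_excitation_energy(exc, e1=1.0, e_lp0=0.5, e_lp1=2.0)
print(float(np.sum(scaled ** 2)))   # close to e_q = 0.25
```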




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6064962A

Filed: 1996-09-13     Issued: 2000-05-16

Formant emphasis method and formant emphasis filter device

(Original Assignee) Toshiba Corp     (Current Assignee) Toshiba Corp

Masahiro Oshikiri, Masami Akamine, Kimio Miseki, Akinobu Yamashita
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (current frame, pitch period) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (control device) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device (average pitch value) which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .
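
The artificial periodic excitation recited in claims 1 and 13 (a low-pass filtered train of pulses, with the first impulse response centred on the quantized first-glottal-pulse position and the remaining ones spaced by the average pitch value) can be pictured with the hedged Python sketch below; the filter design, its length and the frame size are illustrative assumptions.

```python
import numpy as np

def build_periodic_excitation(frame_len: int, first_pulse_pos: int,
                              avg_pitch: int, lp_taps: int = 31) -> np.ndarray:
    """Hedged sketch: low-pass filtered periodic pulse train for a lost onset frame.

    A unit pulse is placed at the quantized position of the first glottal
    pulse and further pulses follow every avg_pitch samples; convolving the
    train with a low-pass impulse response (an assumed windowed-sinc FIR)
    is equivalent to centring copies of that impulse response on the pulse
    positions. Filter design and lengths are illustrative assumptions.
    """
    pulses = np.zeros(frame_len)
    pos = first_pulse_pos
    while pos < frame_len:                       # up to the end of the constructed part
        pulses[pos] = 1.0
        pos += avg_pitch
    n = np.arange(lp_taps) - (lp_taps - 1) / 2.0
    h = np.sinc(0.25 * n) * np.hamming(lp_taps)  # assumed low-pass impulse response
    h /= np.sum(h)
    return np.convolve(pulses, h, mode="same")   # centred on each pulse position

# Hypothetical 20 ms frame at 12.8 kHz, quantized first pulse at sample 17:
exc = build_periodic_excitation(frame_len=256, first_pulse_pos=17, avg_pitch=64)
print(int(np.argmax(np.abs(exc[:64]))))          # peak sits at the first pulse position
```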

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (current frame, pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .
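
To illustrate the phase-information step recited in claims 3, 11, 15 and 23 (measure the sample of maximum amplitude within a pitch period as the first glottal pulse and quantize its position), here is a minimal Python sketch; the residual signal, the pitch value and the quantization grid are assumed for the example.

```python
import numpy as np

def first_glottal_pulse_position(residual: np.ndarray, pitch: int, step: int = 4) -> int:
    """Hedged sketch: locate and quantize the first glottal pulse position.

    The sample of maximum absolute amplitude within one pitch period from
    the frame start is taken as the first glottal pulse, and its position
    is then quantized; the uniform `step`-sample grid is an assumption,
    the quoted claims only require that the position be quantized.
    """
    search = residual[:pitch]                 # one pitch period
    pos = int(np.argmax(np.abs(search)))      # sample of maximum amplitude
    return (pos // step) * step               # quantized position

# Hypothetical LP residual with a dominant pulse at sample 42:
rng = np.random.default_rng(1)
res = 0.05 * rng.standard_normal(256)
res[42] = 1.0
print(first_glottal_pulse_position(res, pitch=64))   # 40 with the assumed 4-sample grid
```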

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (current frame, pitch period) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (current frame, pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (current frame, pitch period) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (current frame, pitch period) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (control device) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device (average pitch value) which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (current frame, pitch period) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (current frame, pitch period) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (current frame, pitch period) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (current frame, pitch period) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6064962A
CLAIM 1
. A speech decoding device comprising : a parameter decoding device which decodes a parameter including at least one of a pitch period (current frame, decoder determines concealment, decoder concealment, pitch period) and a pitch gain of a speech signal from coded speech signal data ;
a synthesis filter which filters the speech signal using the parameter decoded by said parameter decoding device ;
a pitch emphasis device which pitch-emphasizes the speech signal filtered by said synthesis filter ;
and a control device which detects a time change in at least one of the pitch period and the pitch gain decoded by said parameter decoding device , and controls a degree of pitch emphasis in said pitch emphasis device on the basis of the change .

US6064962A
CLAIM 5
. The speech decoding device according to claim 4 , wherein said input speech signal is input in units of frames further comprising : a buffer memory which stores a filter coefficient relating to a previous frame of the input speech signal ;
and a filter coefficient limiter which limits variation of the filter coefficient relating to a current frame (current frame, decoder determines concealment, decoder concealment, pitch period) which is calculated by said multiplier on the basis of the filter coefficient relating to the previous frame .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH1069297A

Filed: 1996-08-26     Issued: 1998-03-10

Speech coding apparatus (音声符号化装置)

(Original Assignee) Nec Corp; 日本電気株式会社     

Kazunori Ozawa, 小澤一範
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .
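
Claims 2, 10, 14 and 22 further recite encoding a shape, sign and amplitude of the first glottal pulse. The hedged sketch below shows one possible form such a payload could take; the shape codebook, the amplitude quantizer and the bit budget are assumptions and do not reflect the patent's actual encoding.

```python
import numpy as np

# Assumed 2-bit codebook of short normalized pulse shapes (hypothetical).
SHAPE_CODEBOOK = [
    np.array([1.0, 0.0, 0.0]),        # plain impulse
    np.array([0.5, 1.0, 0.5]),        # triangular
    np.array([-0.3, 1.0, -0.3]),      # differentiated-looking pulse
    np.array([0.25, 0.5, 1.0]),       # rising ramp
]

def encode_first_glottal_pulse(segment: np.ndarray) -> dict:
    """Hedged sketch: encode shape index, sign and quantized amplitude of a pulse.

    `segment` is a short excerpt of the residual centred on the detected
    first glottal pulse. The shape is chosen by normalized correlation
    against the assumed codebook, the sign is the sign of the centre
    sample, and the amplitude is quantized on an assumed 3-bit log scale.
    """
    centre = len(segment) // 2
    sign = 1 if segment[centre] >= 0 else -1
    window = np.abs(segment[centre - 1: centre + 2])
    scores = [np.dot(window, s) / (np.linalg.norm(window) * np.linalg.norm(s) + 1e-12)
              for s in SHAPE_CODEBOOK]
    shape_idx = int(np.argmax(scores))
    amp = float(np.max(window))
    amp_idx = int(np.clip(np.round(np.log2(amp + 1e-12) + 4), 0, 7))  # 3-bit index (assumption)
    return {"shape": shape_idx, "sign": sign, "amplitude_index": amp_idx}

seg = np.array([0.1, 0.4, 1.2, 0.5, 0.05])
print(encode_first_glottal_pulse(seg))
```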

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

JPH1069297A
CLAIM 2
[Claim 2] The speech coding apparatus according to claim 1 , characterized in that the excitation quantization section has [有すること] (maximum amplitude) a codebook for collectively quantizing the amplitudes or polarities of a plurality of pulses .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .
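
Claims 5 and 17 describe a two-sided energy control after an erasure: scale the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converge toward the energy implied by the received energy information parameter by the end of that frame while limiting any increase. A hedged Python sketch of one possible gain trajectory follows; the interpolation shape and the limiting cap are assumptions.

```python
import numpy as np

def energy_control_gains(frame_len: int, e_start_target: float, e_start_actual: float,
                         e_end_target: float, e_end_actual: float,
                         max_gain: float = 2.0) -> np.ndarray:
    """Hedged sketch: per-sample gains for post-erasure energy control.

    g0 matches the start-of-frame energy to the end of the last concealed
    frame, g1 converges to the received energy information at the frame
    end; both gains are capped (assumed cap `max_gain`) to limit any
    energy increase, and the trajectory is linearly interpolated
    (the interpolation shape is an assumption).
    """
    g0 = min(float(np.sqrt(e_start_target / (e_start_actual + 1e-12))), max_gain)
    g1 = min(float(np.sqrt(e_end_target / (e_end_actual + 1e-12))), max_gain)
    return np.linspace(g0, g1, frame_len)

# Hypothetical values: the concealed frame ended quieter than the decoded
# first good frame starts, so the frame is attenuated at its start and
# allowed to recover toward the transmitted energy by its end.
gains = energy_control_gains(256, e_start_target=0.2, e_start_actual=0.8,
                             e_end_target=1.0, e_end_actual=0.9)
print(gains[0], gains[-1])
```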

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (切替えること) and the first non erased frame received after frame erasure is encoded as active speech .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

JPH1069297A
CLAIM 5
[Claim 5] The speech coding apparatus according to claim 3 or 4 , characterized by comprising a mode discrimination circuit which obtains a feature quantity from the input speech signal , discriminates among a plurality of modes from the feature quantity , and outputs mode information , and by switching [切替えること] (comfort noise) between use of the first excitation quantization section and use of the second excitation quantization section according to the mode information .
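
Claims 7 and 19 pin the start-of-frame scaling gain to the end-of-frame gain in two specific transition cases (a voiced-to-unvoiced transition, and a comfort-noise to active-speech transition). The small helper below restates that decision logic; the class labels and calling convention are assumptions for illustration only.

```python
def use_end_gain_at_frame_start(last_good_class: str, first_good_class: str,
                                last_good_is_comfort_noise: bool,
                                first_good_is_active_speech: bool) -> bool:
    """Hedged sketch of the two transition cases quoted in claims 7 and 19.

    Case 1: voiced-to-unvoiced transition -- the last good frame before the
    erasure was voiced transition / voiced / onset and the first good frame
    after it is unvoiced.
    Case 2: non-active to active speech -- the last good frame was encoded
    as comfort noise and the first good frame as active speech.
    """
    voiced_like = {"voiced transition", "voiced", "onset"}
    case1 = last_good_class in voiced_like and first_good_class == "unvoiced"
    case2 = last_good_is_comfort_noise and first_good_is_active_speech
    return case1 or case2

print(use_end_gain_at_frame_start("voiced", "unvoiced", False, False))   # True
print(use_end_gain_at_frame_start("unvoiced", "unvoiced", True, True))   # True
```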

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .
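
Claims 8 and 20 compare the gain of the LP filter before and after the erasure; one common way to express that gain, consistent with the E_LP0 and E_LP1 terms of the quoted relation, is the energy of the LP synthesis filter impulse response. The Python sketch below computes it from a set of LP coefficients; the coefficient convention and the truncation length are assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a: np.ndarray, length: int = 64) -> float:
    """Hedged sketch: energy of the impulse response of the LP synthesis filter 1/A(z).

    `a` holds the denominator coefficients [1, a1, ..., aM] (an assumed
    convention), and the impulse response is truncated to `length`
    samples, which is a practical approximation of the filter gain.
    """
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)   # impulse response of 1/A(z)
    return float(np.sum(h ** 2))

# Hypothetical comparison: the first good frame after the erasure has the
# higher LP gain, which is the case in which the quoted claims apply the
# excitation-energy adjustment.
e_lp0 = lp_impulse_response_energy(np.array([1.0, -0.5]))   # last good frame before erasure
e_lp1 = lp_impulse_response_energy(np.array([1.0, -0.9]))   # first good frame after erasure
print(e_lp0, e_lp1, e_lp1 > e_lp0)   # True
```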

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

JPH1069297A
CLAIM 2
[Claim 2] The speech coding apparatus according to claim 1 , characterized in that the excitation quantization section has [有すること] (maximum amplitude) a codebook for collectively quantizing the amplitudes or polarities of a plurality of pulses .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

JPH1069297A
CLAIM 2
[Claim 2] The speech coding apparatus according to claim 1 , characterized in that the excitation quantization section has [有すること] (maximum amplitude) a codebook for collectively quantizing the amplitudes or polarities of a plurality of pulses .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (切替えること) and the first non erased frame received after frame erasure is encoded as active speech .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

JPH1069297A
CLAIM 5
[Claim 5] The speech coding apparatus according to claim 3 or 4 , characterized by comprising a mode discrimination circuit which obtains a feature quantity from the input speech signal , discriminates among a plurality of modes from the feature quantity , and outputs mode information , and by switching [切替えること] (comfort noise) between use of the first excitation quantization section and use of the second excitation quantization section according to the mode information .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH1069297A
CLAIM 1
[Claim 1] A speech coding apparatus comprising : a spectral parameter calculation section which obtains spectral parameters from an input speech signal [音声信号] (sound signal, speech signal) and quantizes them ; and an excitation quantization section in which an excitation signal of the speech signal is composed of M non-zero pulses , and which , when searching for the positions of the pulses using the spectral parameters , searches for and outputs the pulse positions while changing a gain for every number of pulses smaller than M .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH1069297A
CLAIM 1
【請求項1】入力した音声信号 (sound signal, speech signal) からスペクトルパラメー タを求めて量子化するスペクトルパラメータ計算部と、 前記音声信号の音源信号が個数Mの非零のパルスから構 成され、前記スペクトルパラメータを用いて前記パルス の位置を探索する際にMよりも小さい個数ごとにゲイン を変化させながらパルスの位置を探索し出力する音源量 子化部とを有する音声符号化装置。
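
For context on the phase-information parameter of claim 22 above, the sketch below shows one plausible way an encoder could encode the shape, sign and amplitude of the first glottal pulse. The two-entry shape codebook, the quantizer step amp_step and the function encode_pulse are illustrative assumptions and are not taken from US7693710B2 or from JPH1069297A.

import numpy as np

# Hypothetical sketch: encode the shape, sign and amplitude of the first glottal
# pulse. The prototype-shape codebook and quantizer step are assumptions.
SHAPE_CODEBOOK = np.array([
    [0.2, 1.0, 0.2],        # shape 0: narrow pulse
    [0.5, 1.0, 0.5],        # shape 1: wider pulse
])

def encode_pulse(pulse_samples, sign, amplitude, amp_step=0.1):
    shape = np.asarray(pulse_samples, dtype=float)
    shape = shape / (np.max(np.abs(shape)) + 1e-12)          # normalize before matching
    errors = np.sum((SHAPE_CODEBOOK - shape) ** 2, axis=1)   # nearest prototype shape
    shape_idx = int(np.argmin(errors))
    sign_bit = 0 if sign >= 0 else 1
    amp_idx = int(round(amplitude / amp_step))               # uniform scalar quantizer
    return shape_idx, sign_bit, amp_idx

if __name__ == "__main__":
    print(encode_pulse([0.18, 0.95, 0.22], sign=+1, amplitude=0.93))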

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH1069297A
CLAIM 1
【請求項1】入力した音声信号 (sound signal, speech signal) からスペクトルパラメー タを求めて量子化するスペクトルパラメータ計算部と、 前記音声信号の音源信号が個数Mの非零のパルスから構 成され、前記スペクトルパラメータを用いて前記パルス の位置を探索する際にMよりも小さい個数ごとにゲイン を変化させながらパルスの位置を探索し出力する音源量 子化部とを有する音声符号化装置。

JPH1069297A
CLAIM 2
【請求項2】音源量子化部において、複数個のパルスの 振幅もしくは極性をまとめて量子化するためのコードブ ックを有すること (maximum amplitude) を特徴とする請求項1に記載の音声符 号化装置。
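
The first-glottal-pulse search recited in claim 23 above (the sample of maximum amplitude within a pitch period, followed by quantization of its position) can be illustrated as follows. This is a minimal sketch; the quantization step, the search over the LP residual and the function name are assumptions made only for illustration.

import numpy as np

# Minimal sketch (assumed quantization step): treat the sample of maximum amplitude
# inside the first pitch period of the LP residual as the first glottal pulse and
# quantize its position with a uniform step.
def first_glottal_pulse_position(residual, pitch_period, step=2):
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))      # sample of maximum amplitude
    pos_index = pos // step                    # quantizer index
    pos_quantized = pos_index * step           # reconstructed (quantized) position
    return pos, pos_index, pos_quantized

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    res = 0.05 * rng.standard_normal(160)
    res[41] = -1.3                             # synthetic glottal pulse
    print(first_glottal_pulse_position(res, pitch_period=64))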

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH1069297A
CLAIM 1
【請求項1】入力した音声信号 (sound signal, speech signal) からスペクトルパラメー タを求めて量子化するスペクトルパラメータ計算部と、 前記音声信号の音源信号が個数Mの非零のパルスから構 成され、前記スペクトルパラメータを用いて前記パルス の位置を探索する際にMよりも小さい個数ごとにゲイン を変化させながらパルスの位置を探索し出力する音源量 子化部とを有する音声符号化装置。
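
The energy information parameter of claim 24 above distinguishes voiced/onset frames (maximum of the signal energy) from other frames (average energy per sample). The sketch below reflects that distinction; expressing the result in dB and the function name energy_information are assumptions, not limitations of the claim.

import numpy as np

# Minimal sketch, assuming a dB-domain parameter: for voiced or onset frames use
# the maximum of the signal energy, otherwise use the average energy per sample.
def energy_information(frame, frame_class):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("VOICED", "ONSET"):
        e = np.max(x ** 2)                # maximum of the signal energy
    else:
        e = np.mean(x ** 2)               # average energy per sample
    return 10.0 * np.log10(e + 1e-12)     # dB representation is an assumed choice

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    frame = rng.standard_normal(256)
    print(energy_information(frame, "VOICED"), energy_information(frame, "UNVOICED"))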

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH1069297A
CLAIM 1
【請求項1】入力した音声信号 (sound signal, speech signal) からスペクトルパラメー タを求めて量子化するスペクトルパラメータ計算部と、 前記音声信号の音源信号が個数Mの非零のパルスから構 成され、前記スペクトルパラメータを用いて前記パルス の位置を探索する際にMよりも小さい個数ごとにゲイン を変化させながらパルスの位置を探索し出力する音源量 子化部とを有する音声符号化装置。
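
The energy adjustment recited in claim 25 above (and in claims 9 and 12 later in this chart), E_q = E_1 (E_LP0 / E_LP1), can be worked through with a short sketch. The impulse-response length, the example LP coefficients and the helper names are assumptions; only the relation itself comes from the claim language.

import numpy as np

# Minimal sketch of E_q = E_1 * (E_LP0 / E_LP1), where E_LP0 and E_LP1 are the
# energies of the impulse responses of the LP synthesis filters of the last good
# frame before the erasure and of the first good frame after it.
def lp_impulse_response_energy(a, n=64):
    """Energy of the impulse response of the all-pole LP synthesis filter 1/A(z)."""
    a = np.asarray(a, dtype=float)
    p = len(a) - 1
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0          # unit impulse input
        for k in range(1, p + 1):
            if i - k >= 0:
                acc -= a[k] * h[i - k]        # recursive (all-pole) part
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1                 # E_q = E_1 * E_LP0 / E_LP1

if __name__ == "__main__":
    a0 = [1.0, -0.9]                          # lower-gain LP filter (last good frame)
    a1 = [1.0, -0.95]                         # higher-gain LP filter (first good frame)
    print(adjusted_excitation_energy(e1=1.0, a_last_good=a0, a_first_good=a1))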




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6092041A

Filed: 1996-08-22     Issued: 2000-07-18

System and method of encoding and decoding a layered bitstream by re-applying psychoacoustic analysis in the decoder

(Original Assignee) Motorola Solutions Inc     (Current Assignee) Google Technology Holdings LLC

Davis Pan, Otto Schnurr
US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (programmable gate array) and the first non erased frame received after frame erasure is encoded as active speech .

US6092041A
CLAIM 4
. The method of claim 3 wherein the method is implemented by a computer program for providing scalable bitrate audio compression parameters , wherein the computer program is implemented/embodied in a tangible medium of at least one of : A) a memory ;
B) an application specific integrated circuit ;
C) a digital signal processor ;
and D) a field programmable gate array (comfort noise) .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (programmable gate array) and the first non erased frame received after frame erasure is encoded as active speech .
US6092041A
CLAIM 4
. The method of claim 3 wherein the method is implemented by a computer program for providing scalable bitrate audio compression parameters , wherein the computer program is implemented/embodied in a tangible medium of at least one of : A) a memory ;
B) an application specific integrated circuit ;
C) a digital signal processor ;
and D) a field programmable gate array (comfort noise) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH1039898A

Filed: 1996-07-22     Issued: 1998-02-13

音声信号伝送方法及び音声符号復号化システム (Speech signal transmission method and speech encoding/decoding system)

(Original Assignee) Nec Corp; 日本電気株式会社     

Toshihiro Hayata, 利浩 早田
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (期間中) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音 (first impulse) 声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中 (pitch period) は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。
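
Claim 1 of US7693710B2 charted above constructs the periodic excitation part of a lost onset frame artificially, as a low-pass filtered periodic train of pulses. The sketch below illustrates that construction; the FIR impulse response coefficients, the function build_periodic_excitation and the frame geometry are assumptions for illustration only and are not taken from the patent or from JPH1039898A.

import numpy as np

# Minimal sketch (assumed filter and geometry): build the periodic part of the
# excitation for a lost onset frame as a low-pass filtered train of pulses. The
# first low-pass impulse response is centered on the quantized position of the
# first glottal pulse, and the remaining ones are placed one average pitch period
# apart, up to the end of the last subframe affected by the construction.
LOWPASS_IR = np.array([0.0462, 0.54, 0.82, 0.54, 0.0462])   # illustrative FIR impulse response

def build_periodic_excitation(first_pulse_pos, avg_pitch, n_samples):
    exc = np.zeros(n_samples)
    half = len(LOWPASS_IR) // 2
    pos = first_pulse_pos
    while pos < n_samples:
        for j, h in enumerate(LOWPASS_IR):    # center one impulse response at pos
            idx = pos - half + j
            if 0 <= idx < n_samples:
                exc[idx] += h
        pos += avg_pitch                      # next pulse one average pitch later
    return exc

if __name__ == "__main__":
    exc = build_periodic_excitation(first_pulse_pos=12, avg_pitch=57, n_samples=256)
    print(np.nonzero(exc)[0][:10])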

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (期間中) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中 (pitch period) は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (期間中) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中 (pitch period) は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (期間中) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音 (first impulse) 声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中 (pitch period) は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (期間中) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中 (pitch period) は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (期間中) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中 (pitch period) は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (の音声信号, 音声復号, 前記送信) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH1039898A
CLAIM 1
【請求項1】 送信側で入力音声信号を符号化して符号 化データとして受信側に伝送し、前記受信側では前記符 号化データを復号化して出力音声信号として出力し、前 記送信側で無音区間を検出したときには無音区間での入 力音声信号を符号化して背景雑音更新信号とし、前記背 景雑音更新信号を前記送信 (speech signal) 側から前記受信側に送信した 後は前記送信側が所定の期間送信を停止し、前記所定の 期間中は前記受信側は既に受信した背景雑音更新信号に 基づいて背景雑音を生成して出力音声信号として出力す る音声信号伝送方法において、 前記無音区間での入力音声信号から量子化スペクトルを 求めるとともに非量子化スペクトル包絡と量子化スペク トル包絡とを算出し、前記非量子化スペクトル包絡と前 記量子化スペクトル包絡の差が所定のしきい値より大き いときには前記量子化スペクトルを変化させ、変化後の 量子化スペクトルに基づいて前記背景雑音更新信号を生 成することを特徴とする音声信号伝送方法。

JPH1039898A
CLAIM 2
【請求項2】 有音区間での前記入力音声信号を符号化 する際に使用する符号帳と前記無音区間で前記入力音声 信号を符号化する際に使用する符号帳とが異なる、請求 項1に記載の音声信号 (speech signal) 伝送方法。

JPH1039898A
CLAIM 3
【請求項3】 音声符号化装置と音声復号 (speech signal) 化装置とを有 し、背景雑音を生成するVOX処理を実行する音声符号 復号化システムにおいて、 前記音声符号化装置に、前記音声符号化装置への入力信 号での非量子化スペクトル包絡と量子化スペクトル包絡 の差を定量的に求めるスペクトル包絡比較手段と、前記 差に応じて前記量子化スペクトル包絡を変化させるスペ クトル包絡変更手段とが設けられ、 前記音声符号化装置は、背景雑音に関する符号化処理を 行う際に前記スペクトル包絡変更手段によって変化した 量子化スペクトル包絡を使用するとともに、その量子化 スペクトル包絡の変化に関するスペクトル変更情報を前 記音声復号化装置側に送信し、 前記音声復号化装置に、受信した前記スペクトル変更情 報を格納するスペクトル変更情報記憶手段と、前記スペ クトル変更情報記憶手段に格納されたスペクトル変更情 報に基づいて、受信した量子化スペクトル包絡を変化さ せる変化済スペクトル係数算出手段とが設けられ、 前記音声符号化装置が背景雑音を生成する際には、前記 変化済スペクトル係数算出手段が出力する量子化スペク トルが使用されることを特徴とする音声符号復号化シス テム。




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5809456A

Filed: 1996-06-27     Issued: 1998-09-15

Voiced speech coding and decoding using phase-adapted single excitation

(Original Assignee) Alcatel Lucent Italia SpA     (Current Assignee) Alcatel Lucent Italia SpA

Silvio Cucchi, Marco Fratti
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .
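
The energy-control behaviour of claim 5 above (scale the synthesized signal so its energy at the beginning of the first good frame matches the energy at the end of the concealed segment, then converge toward the received energy information toward the frame end while limiting any increase) can be sketched as follows. The 32-sample measurement windows, the linear gain interpolation, the max_gain cap and the function name are assumptions, not the patent's reference implementation.

import numpy as np

# Minimal sketch (assumed interpolation and cap): scale the first good frame after
# an erasure for energy continuity and controlled recovery.
def scale_first_good_frame(synth, e_end_concealed, e_received, max_gain=2.0):
    x = np.asarray(synth, dtype=float)
    e_begin = np.mean(x[:32] ** 2) + 1e-12        # energy at the frame beginning
    e_end = np.mean(x[-32:] ** 2) + 1e-12         # energy toward the frame end
    g0 = np.sqrt(e_end_concealed / e_begin)       # match the end of the erased segment
    g1 = np.sqrt(e_received / e_end)              # converge to the received energy
    g0, g1 = min(g0, max_gain), min(g1, max_gain) # limit the increase in energy
    gains = np.linspace(g0, g1, len(x))           # sample-wise gain interpolation
    return x * gains

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    frame = rng.standard_normal(256)
    print(scale_first_good_frame(frame, e_end_concealed=0.25, e_received=1.0)[:4])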

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .
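
Claim 6 above adds one further condition: when the first good frame after an erasure is classified as onset, the scaling gain is limited to a given value. A minimal sketch follows; the limit value and names are assumed for illustration.

# Minimal sketch (the limit value is an assumption, not taken from the patent):
# clip the scaling gain when the first good frame is classified as onset.
ONSET_GAIN_LIMIT = 1.2

def limited_gain(gain, frame_class):
    if frame_class == "ONSET":
        return min(gain, ONSET_GAIN_LIMIT)
    return gain

if __name__ == "__main__":
    print(limited_gain(1.8, "ONSET"), limited_gain(1.8, "VOICED"))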

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (filter output) and the first non erased frame received after frame erasure is encoded as active speech .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting (comfort noise) a synthesized waveform when provided as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (autocorrelation coefficients) of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients (LP filter, LP filter excitation signal) of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting a synthesized waveform when provided as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (autocorrelation coefficients) excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (when p) = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients (LP filter, LP filter excitation signal) of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting a synthesized waveform when (E q) provided as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (autocorrelation coefficients) of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (when p) = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients (LP filter, LP filter excitation signal) of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting a synthesized waveform when (E q) provided as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (filter output) and the first non erased frame received after frame erasure is encoded as active speech .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting (comfort noise) a synthesized waveform when provided as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .
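
For reference, the gain-handling condition of US7693710B2 claims 7 and 19 charted above (make the scaling gain at the frame beginning equal to the gain at the frame end during a voiced-to-unvoiced transition, or during a comfort-noise-to-active-speech transition) reduces to a simple decision, sketched below in Python. The string class labels and argument names are illustrative assumptions, not terms taken from the patent's implementation.

def use_flat_gain(last_class, first_class, last_was_cng, first_is_active):
    """Return True when the scaling gain at the beginning of the first good
    frame should simply equal the gain at its end (no gradual convergence).

    last_class     : classification of the last good frame before the erasure
    first_class    : classification of the first good frame after the erasure
    last_was_cng   : last good frame was encoded as comfort noise
    first_is_active: first good frame is encoded as active speech
    """
    voiced_to_unvoiced = (last_class in ("voiced transition", "voiced", "onset")
                          and first_class == "unvoiced")
    cng_to_active = last_was_cng and first_is_active
    return voiced_to_unvoiced or cng_to_active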

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (autocorrelation coefficients) of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients (LP filter, LP filter excitation signal) of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting a synthesized waveform when provided as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (autocorrelation coefficients) excitation signal produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (when p) = E_1 * (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients (LP filter, LP filter excitation signal) of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting a synthesized waveform when provided (E q) as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .
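
For reference, the relation E_q = E_1 * (E_LP0 / E_LP1) recited in US7693710B2 claims 9, 12, 21 and 25 can be evaluated as sketched below, where the impulse-response energy of each LP synthesis filter is obtained by exciting 1/A(z) with a unit impulse. The 64-sample truncation of the impulse response and the direct-form coefficient convention a = [1, a1, ..., ap] are assumptions made for this illustration.

import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the (truncated) impulse response of the LP synthesis filter 1/A(z)."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(len(a), i + 1)):
            acc -= a[k] * h[i - k]      # y[n] = x[n] - sum_k a[k] * y[n-k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """Illustrative evaluation of E_q = E_1 * (E_LP0 / E_LP1)."""
    e_lp0 = lp_impulse_response_energy(a_last_good)   # LP filter of last good frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # LP filter of first good frame after erasure
    return e1 * e_lp0 / e_lp1

For example, with a_last_good = [1.0, -0.9] and a_first_good = [1.0, -0.5], the first filter has the larger impulse-response energy, so E_q exceeds E_1.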

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (autocorrelation coefficients) of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (when p) = E_1 * (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5809456A
CLAIM 1
. A method of coding a sampled voiced speech signal , said voiced speech signal containing a repetition of a prototype waveform , the method comprising the steps of : a) taking a segment of said sampled voiced speech signal the segment having a length equal to the length of the prototype waveform , and extending the sampled voiced speech signal using the period of the prototype waveform ;
b) calculating a series of autocorrelation coefficients (LP filter, LP filter excitation signal) of said extended sampled voiced speech signal segment ;
c) calculating , from said series of autocorrelation coefficients , a series of linear predictive coding (LPC) coefficients , relative to a synthesis filter the synthesis filter outputting a synthesized waveform when provided (E q) as input an excitation waveform ;
d) determining the excitation waveform of said synthesis filter in terms of the LPC coefficients and a single phase-adapted pulse , the single pulse phase-adapted so that the signal coming out from said synthesis filter is minimally distorted with respect to said sampled speech signal segment ;
and e) quantizing said series of LPC coefficients and said excitation waveform .

US5809456A
CLAIM 9
. A method of decoding an encoded sampled voiced speech signal , the method comprising the steps of : a) receiving a set of linear predictive coding (LPC) filter parameters ;
b) receiving an excitation waveform in terms of excitation parameters , said excitation parameters including amplitude , phase and position information ;
c) performing an inverse transform to obtain an unpositioned excitation waveform ;
d) receiving a length of a prototype waveform ;
e) translating in time (first non) the unpositioned excitation waveform to the received position and adjusting its amplitude to the received amplitude to provide an unperiodicized excitation waveform ;
f) periodicizing said unperiodicized excitation waveform according to the prototype waveform length ;
g) calculating the prototype waveform from the LPC filter parameters and the periodicized excitation waveform ;
h) receiving interpolation parameters for prototype waveform interpolation ;
and i) reconstructing said sampled voiced speech signal by performing prototype waveform interpolation using the interpolation parameters .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5819298A

Filed: 1996-06-24     Issued: 1998-10-06

File allocation tables with holes

(Original Assignee) Sun Microsystems Inc     (Current Assignee) Oracle America Inc

Thomas K. Wong, Peter W. Madany
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (first one) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (first one) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (first one) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5819298A
CLAIM 1
. A computer-implemented method of storing data using a file system that includes a file allocation table (FAT) having a plurality of FAT elements , each FAT element of the file allocation table corresponding to a unique region of mass storage and being arranged to represent the status of that unique region of mass storage , the method comprising the steps of : a) requesting that a first data segment be written to mass storage ;
b) determining whether the first data segment may be represented by a hole ;
and c) wherein when it (LP filter excitation signal) is determined that the first data segment may be represented by a hole , the method further includes the step of storing a first status indicator in a file allocation table extension at a first extension element , said file allocation table extension being included in said file system and having a plurality of extension elements , wherein said extension elements of said file allocation table extension do not correspond to any region in mass storage , thereby indicating that the first data segment is not stored in the mass storage of the computer .

US5819298A
CLAIM 4
. A method as recited in claim 1 wherein the step of determining whether the first data segment represents a (LP filter excitation signal) hole includes the sub-step of determining whether the first data segment is a uniform sequence of bits that does not need to be represented in mass storage .

US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a, when i) produced in the decoder during the received first non (first one) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 * (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819298A
CLAIM 1
. A computer-implemented method of storing data using a file system that includes a file allocation table (FAT) having a plurality of FAT elements , each FAT element of the file allocation table corresponding to a unique region of mass storage and being arranged to represent the status of that unique region of mass storage , the method comprising the steps of : a) requesting that a first data segment be written to mass storage ;
b) determining whether the first data segment may be represented by a hole ;
and c) wherein when it (LP filter excitation signal) is determined that the first data segment may be represented by a hole , the method further includes the step of storing a first status indicator in a file allocation table extension at a first extension element , said file allocation table extension being included in said file system and having a plurality of extension elements , wherein said extension elements of said file allocation table extension do not correspond to any region in mass storage , thereby indicating that the first data segment is not stored in the mass storage of the computer .

US5819298A
CLAIM 4
. A method as recited in claim 1 wherein the step of determining whether the first data segment represents a (LP filter excitation signal) hole includes the sub-step of determining whether the first data segment is a uniform sequence of bits that does not need to be represented in mass storage .

US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 * (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819298A
CLAIM 1
. A computer-implemented method of storing data using a file system that includes a file allocation table (FAT) having a plurality of FAT elements , each FAT element of the file allocation table corresponding to a unique region of mass storage and being arranged to represent the status of that unique region of mass storage , the method comprising the steps of : a) requesting that a first data segment be written to mass storage ;
b) determining whether the first data segment may be represented by a hole ;
and c) wherein when it (LP filter excitation signal) is determined that the first data segment may be represented by a hole , the method further includes the step of storing a first status indicator in a file allocation table extension at a first extension element , said file allocation table extension being included in said file system and having a plurality of extension elements , wherein said extension elements of said file allocation table extension do not correspond to any region in mass storage , thereby indicating that the first data segment is not stored in the mass storage of the computer .

US5819298A
CLAIM 4
. A method as recited in claim 1 wherein the step of determining whether the first data segment represents a (LP filter excitation signal) hole includes the sub-step of determining whether the first data segment is a uniform sequence of bits that does not need to be represented in mass storage .

US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (first one) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (first one) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (first one) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5819298A
CLAIM 1
. A computer-implemented method of storing data using a file system that includes a file allocation table (FAT) having a plurality of FAT elements , each FAT element of the file allocation table corresponding to a unique region of mass storage and being arranged to represent the status of that unique region of mass storage , the method comprising the steps of : a) requesting that a first data segment be written to mass storage ;
b) determining whether the first data segment may be represented by a hole ;
and c) wherein when it (LP filter excitation signal) is determined that the first data segment may be represented by a hole , the method further includes the step of storing a first status indicator in a file allocation table extension at a first extension element , said file allocation table extension being included in said file system and having a plurality of extension elements , wherein said extension elements of said file allocation table extension do not correspond to any region in mass storage , thereby indicating that the first data segment is not stored in the mass storage of the computer .

US5819298A
CLAIM 4
. A method as recited in claim 1 wherein the step of determining whether the first data segment represents a (LP filter excitation signal) hole includes the sub-step of determining whether the first data segment is a uniform sequence of bits that does not need to be represented in mass storage .

US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a, when i) produced in the decoder during the received first non (first one) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 * (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819298A
CLAIM 1
. A computer-implemented method of storing data using a file system that includes a file allocation table (FAT) having a plurality of FAT elements , each FAT element of the file allocation table corresponding to a unique region of mass storage and being arranged to represent the status of that unique region of mass storage , the method comprising the steps of : a) requesting that a first data segment be written to mass storage ;
b) determining whether the first data segment may be represented by a hole ;
and c) wherein when it (LP filter excitation signal) is determined that the first data segment may be represented by a hole , the method further includes the step of storing a first status indicator in a file allocation table extension at a first extension element , said file allocation table extension being included in said file system and having a plurality of extension elements , wherein said extension elements of said file allocation table extension do not correspond to any region in mass storage , thereby indicating that the first data segment is not stored in the mass storage of the computer .

US5819298A
CLAIM 4
. A method as recited in claim 1 wherein the step of determining whether the first data segment represents a (LP filter excitation signal) hole includes the sub-step of determining whether the first data segment is a uniform sequence of bits that does not need to be represented in mass storage .

US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (first one) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a, when i) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 * (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5819298A
CLAIM 1
. A computer-implemented method of storing data using a file system that includes a file allocation table (FAT) having a plurality of FAT elements , each FAT element of the file allocation table corresponding to a unique region of mass storage and being arranged to represent the status of that unique region of mass storage , the method comprising the steps of : a) requesting that a first data segment be written to mass storage ;
b) determining whether the first data segment may be represented by a hole ;
and c) wherein when it (LP filter excitation signal) is determined that the first data segment may be represented by a hole , the method further includes the step of storing a first status indicator in a file allocation table extension at a first extension element , said file allocation table extension being included in said file system and having a plurality of extension elements , wherein said extension elements of said file allocation table extension do not correspond to any region in mass storage , thereby indicating that the first data segment is not stored in the mass storage of the computer .

US5819298A
CLAIM 4
. A method as recited in claim 1 wherein the step of determining whether the first data segment represents a (LP filter excitation signal) hole includes the sub-step of determining whether the first data segment is a uniform sequence of bits that does not need to be represented in mass storage .

US5819298A
CLAIM 5
. A method as recited in claim 1 wherein the file system includes a plurality of file allocation table extensions wherein a first one (first non) of the file allocation table extensions is arranged to indicate holes in the data to be stored and a second one of the file allocation table extensions is arranged to indicate data segments that are compressed .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6029128A

Filed: 1996-06-13     Issued: 2000-02-22

Speech synthesizer

(Original Assignee) Nokia Mobile Phones Ltd     (Current Assignee) Nokia Technologies Oy

Kari Jarvinen, Tero Honkanen
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .
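
For reference, the artificial periodic excitation recited in US7693710B2 claim 1 charted above (a low-pass filtered periodic train of pulses: one low-pass impulse response centred on the quantized position of the first glottal pulse, further impulse responses placed every average pitch period) can be sketched as below in Python. Building the excitation over a single frame-length buffer, rather than up to the end of the last affected subframe, is a simplification made for this illustration.

import numpy as np

def artificial_onset_excitation(frame_len, first_pulse_pos, pitch_period, lp_filter_ir):
    """Illustrative construction of the periodic excitation part for a lost onset.

    frame_len       : number of excitation samples to build
    first_pulse_pos : quantized position of the first glottal pulse, in samples
                      from the beginning of the onset frame
    pitch_period    : rounded average pitch value, in samples
    lp_filter_ir    : impulse response of the low-pass filter
    """
    exc = np.zeros(frame_len)
    half = len(lp_filter_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        # Centre one low-pass impulse response on the current pulse position;
        # samples falling outside the buffer are simply truncated.
        for i, h in enumerate(lp_filter_ir):
            idx = pos - half + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += pitch_period
    return exc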

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .
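
For reference, the phase information of US7693710B2 claim 2 charted above (encode the shape, sign and amplitude of the first glottal pulse) can be sketched as below in Python. The correlation-based shape selection, the unit-energy shape codebook and the scalar amplitude levels are assumptions made for this illustration, not the patent's actual quantizers.

import numpy as np

def encode_first_glottal_pulse(residual, pos, shapes, amp_levels):
    """Illustrative encoding of the first glottal pulse's shape, sign and amplitude.

    residual   : LP residual of the frame
    pos        : position of the first glottal pulse, in samples
    shapes     : candidate pulse shapes (1-D arrays of equal length, unit energy)
    amp_levels : scalar amplitude quantization levels
    """
    n = len(shapes[0])
    seg = np.asarray(residual[pos:pos + n], dtype=float)
    if len(seg) < n:                      # zero-pad near the frame end
        seg = np.pad(seg, (0, n - len(seg)))
    sign = 1 if seg[np.argmax(np.abs(seg))] >= 0 else -1
    # Pick the shape with the highest correlation against the sign-corrected pulse.
    shape_idx = int(np.argmax([np.dot(sign * seg, s) for s in shapes]))
    # Scalar-quantize the amplitude (largest magnitude in the segment).
    amp = float(np.max(np.abs(seg)))
    amp_idx = int(np.argmin([abs(amp - a) for a in amp_levels]))
    return shape_idx, sign, amp_idx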

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .
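
For reference, the first-glottal-pulse determination of US7693710B2 claim 3 charted above (take the sample of maximum amplitude within a pitch period as the first glottal pulse and quantize its position) can be sketched as below; the uniform quantization step is an assumption, not the patent's quantizer.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Illustrative determination and quantization of the first glottal pulse position.

    residual     : LP residual of the frame
    pitch_period : pitch period, in samples
    step         : assumed uniform quantization step for the position
    """
    segment = np.abs(np.asarray(residual[:pitch_period], dtype=float))
    pos = int(np.argmax(segment))          # sample of maximum amplitude in the pitch period
    quantized = (pos // step) * step       # uniform scalar quantization of the position
    return pos, quantized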

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook, linear scale) per sample for other frames .
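A minimal sketch of the energy-information limitation recited in claim 4 above: the parameter is related to the maximum of the signal energy for frames classified as voiced or onset, and to the average energy per sample for other frames. The class labels, dB conversion and function name are illustrative assumptions.

import math

def energy_info_parameter(frame, frame_class):
    """Hedged sketch of the claim-4 energy information parameter.

    frame       : list of float samples of the speech frame
    frame_class : one of "unvoiced", "unvoiced_transition", "voiced_transition",
                  "voiced", "onset"
    """
    if frame_class in ("voiced", "onset"):
        # Relate the parameter to the maximum of the signal energy
        # (here: the largest squared sample in the frame).
        energy = max(s * s for s in frame)
    else:
        # Relate the parameter to the average energy per sample.
        energy = sum(s * s for s in frame) / len(frame)
    # Expressing the parameter in dB is an assumption for illustration only.
    return 10.0 * math.log10(energy + 1e-12)

if __name__ == "__main__":
    frame = [0.01 * ((-1) ** n) * n for n in range(80)]
    print(energy_info_parameter(frame, "voiced"))
    print(energy_info_parameter(frame, "unvoiced"))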
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler (average energy) and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook (average energy) and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
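The energy-control limitation of claim 5 above can be read as scaling the first good frame with a gain that starts at a value matching the end-of-concealment energy and converges toward the energy conveyed by the received energy information parameter, with any increase limited. A hedged sketch under that reading (the linear gain interpolation and the cap value are assumptions, not claim language):

def scale_first_good_frame(synth, e_concealed_end, e_received, max_gain=1.98):
    """Hedged sketch of claim-5 energy control in the first good frame after erasure.

    synth           : synthesized samples of the first non-erased frame
    e_concealed_end : per-sample energy at the end of the last concealed frame
    e_received      : energy corresponding to the received energy information parameter
    max_gain        : cap limiting any energy increase (value is illustrative)
    """
    # Per-sample energy actually produced by the decoder in this frame.
    e_actual = sum(s * s for s in synth) / len(synth)
    if e_actual <= 0.0:
        return list(synth)

    # g0 makes the start of the frame similar in energy to the end of the concealment;
    # g1 converges toward the received energy information, with the increase limited.
    g0 = min((e_concealed_end / e_actual) ** 0.5, max_gain)
    g1 = min((e_received / e_actual) ** 0.5, max_gain)

    # Linear interpolation of the gain across the frame is an illustrative assumption.
    n = len(synth)
    return [((g0 + (g1 - g0) * i / (n - 1)) if n > 1 else g1) * s
            for i, s in enumerate(synth)]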
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
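A small sketch of the two transition conditions named in claim 7 above under which the start-of-frame gain is simply set equal to the end-of-frame gain; the class labels and boolean flags are illustrative, not claim wording.

def use_flat_gain(last_good_class, first_good_class,
                  last_good_is_comfort_noise, first_good_is_active_speech):
    """Hedged sketch of the claim-7 condition for using a single (flat) scaling gain.

    Two situations are named in the claim: a voiced-to-unvoiced transition, and a
    transition from comfort noise to active speech across the erasure.
    """
    voiced_like = ("voiced_transition", "voiced", "onset")
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class == "unvoiced")
    cn_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or cn_to_active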
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation (following relationships) : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
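Claims 8 and 9 above read onto an energy adjustment E_q = E_1 × (E_LP0 / E_LP1), applied when the LP filter of the first good frame has a higher gain than that of the last erased frame. The sketch below measures LP filter gain as impulse-response energy and, as a simplifying assumption, treats the last erased frame's LP filter as the same as that of the last good frame (the usual concealment behaviour); the function names and the filter sign convention are illustrative.

def lp_impulse_response_energy(a, length=64):
    """Energy of the impulse response of the LP synthesis filter 1 / A(z),
    with A(z) = 1 + a[0] z^-1 + ... + a[M-1] z^-M (sign convention is an assumption)."""
    h = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                x -= ak * h[n - k]
        h.append(x)
    return sum(v * v for v in h)

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """Hedged sketch of the claim-8/9 adjustment of the LP excitation energy.

    e1           : energy at the end of the current (first good) frame
    a_last_good  : LP coefficients of the last good frame before erasure (assumed to
                   have been reused for the erased frames during concealment)
    a_first_good : LP coefficients of the first good frame after erasure
    """
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    if e_lp1 > e_lp0:
        # LP gain of the first good frame is higher: target E_q = E_1 * (E_LP0 / E_LP1).
        return e1 * e_lp0 / e_lp1
    return e1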
US6029128A
CLAIM 8
. A synthesiser for speech synthesis , comprising first and second excitation sources for respectively generating first and second excitation signals , and modifying means for modifying the first excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal , wherein the modifying means scales the first excitation signal in accordance with a scaling factor (a) derivable from pitch information associated with the first signal , wherein the first excitation source is an adaptive code book and the second excitation source is a fixed code book , wherein the scaling factor (a) is of the form a=b+p , where b is an adaptive code book gain and p is a perceptual enhancement gain factor derivable in accordance with the following relationships ;
##EQU24## where TH represents threshold values , a enh is a linear scaler and f(b) is a function of gain b , wherein the first and second excitation signals are combined after modification , and further comprising an adaptive energy control means for modifying combined scaled first and second signals in accordance with the following relationship ;
##EQU25## where N is a suitable adaption period , ex(n) is the combined first and second signals , ew' ;
(n) is the combined scaled first and second signals and K is an energy scale factor .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
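A minimal sketch of the phase-information determination of claims 10 and 11 above: the sample of maximum amplitude within a pitch period is taken as the first glottal pulse and its position is quantized. The uniform quantization step and the returned position/sign/amplitude packaging are illustrative assumptions.

def first_glottal_pulse_position(excitation, pitch_period, q_step=4):
    """Hedged sketch of first-glottal-pulse search and position quantization.

    excitation   : list of float samples (e.g. LP residual or excitation) of the frame
    pitch_period : pitch period in samples for the frame
    q_step       : position quantization step (illustrative assumption)
    Returns (position, quantized_position, sign, amplitude).
    """
    t = max(1, min(int(pitch_period), len(excitation)))
    # Sample of maximum amplitude within the pitch period is taken as the first pulse.
    pos = max(range(t), key=lambda n: abs(excitation[n]))
    amplitude = excitation[pos]
    sign = 1 if amplitude >= 0 else -1
    # Uniform quantization of the position within the pitch period (assumption).
    q_pos = (pos // q_step) * q_step
    return pos, q_pos, sign, abs(amplitude)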
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation (following relationships) : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 8
. A synthesiser for speech synthesis , comprising first and second excitation sources for respectively generating first and second excitation signals , and modifying means for modifying the first excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal , wherein the modifying means scales the first excitation signal in accordance with a scaling factor (a) derivable from pitch information associated with the first signal , wherein the first excitation source is an adaptive code book and the second excitation source is a fixed code book , wherein the scaling factor (a) is of the form a=b+p , where b is an adaptive code book gain and p is a perceptual enhancement gain factor derivable in accordance with the following relationships ;
##EQU24## where TH represents threshold values , a enh is a linear scaler and f(b) is a function of gain b , wherein the first and second excitation signals are combined after modification , and further comprising an adaptive energy control means for modifying combined scaled first and second signals in accordance with the following relationship ;
##EQU25## where N is a suitable adaption period , ex(n) is the combined first and second signals , ew' ;
(n) is the combined scaled first and second signals and K is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
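A hedged sketch of the claim-13 artificial periodic excitation: one low-pass filter impulse response centred on the quantized first glottal pulse position, then further impulse responses placed every (rounded) average pitch period up to the end of the affected region. The FIR taps, rounding and function name are illustrative assumptions.

def build_periodic_excitation(first_pulse_pos, avg_pitch, frame_len, lp_taps=None):
    """Hedged sketch of the artificially constructed periodic excitation part.

    first_pulse_pos : quantized position of the first glottal pulse in the frame
    avg_pitch       : average pitch value in samples
    frame_len       : number of samples to construct (through the last affected subframe)
    lp_taps         : symmetric low-pass FIR impulse response (illustrative values below)
    """
    if lp_taps is None:
        lp_taps = [0.0053, 0.0506, 0.2500, 0.3882, 0.2500, 0.0506, 0.0053]
    half = len(lp_taps) // 2
    out = [0.0] * frame_len
    pos = first_pulse_pos
    period = max(1, int(round(avg_pitch)))
    while pos < frame_len:
        # Centre one impulse response of the low-pass filter on the pulse position.
        for i, tap in enumerate(lp_taps):
            n = pos + i - half
            if 0 <= n < frame_len:
                out[n] += tap
        # Next pulse one average pitch period later.
        pos += period
    return out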
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew' ;
(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook, linear scale) per sample for other frames .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler (average energy) and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook (average energy) and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
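To make the two-stage energy control of claim 17 concrete, the sketch below scales the first non erased frame so that it starts at the energy the concealed signal ended with and then ramps toward a gain derived from the received energy information, with the ramp capped so the energy increase stays limited. The linear interpolation and the cap value are assumptions; the claim itself only requires matching at the frame start, convergence by the frame end, and a limited energy increase.

```python
from typing import List

def recovery_energy_control(frame: List[float],
                            e_concealed_end: float,
                            e_received: float,
                            max_gain_increase: float = 2.0) -> List[float]:
    # Sketch of the two-stage energy control: g0 makes the start of the first
    # good frame match the energy at the end of the concealed signal, and the
    # gain then ramps toward g1, derived from the received energy information,
    # with the increase over g0 capped (illustrative cap, not from the chart).
    e_frame = max(sum(x * x for x in frame) / len(frame), 1e-12)
    g0 = (e_concealed_end / e_frame) ** 0.5
    g1 = min((e_received / e_frame) ** 0.5, g0 * max_gain_increase)
    denom = max(len(frame) - 1, 1)
    return [((g0 * (denom - i) + g1 * i) / denom) * x
            for i, x in enumerate(frame)]
```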
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
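The two transition cases in which claim 19 makes the start-of-frame gain equal to the end-of-frame gain reduce to a simple predicate; the class and coding-type labels used below are illustrative names rather than terms taken from the chart.

```python
def use_flat_gain(last_class_before_erasure: str,
                  first_class_after_erasure: str,
                  last_frame_coding: str,
                  first_frame_coding: str) -> bool:
    # Sketch of the two charted cases in which the gain at the beginning of the
    # first good frame is simply made equal to the gain at its end.
    voiced_to_unvoiced = (
        last_class_before_erasure in ("voiced transition", "voiced", "onset")
        and first_class_after_erasure == "unvoiced")
    noise_to_speech = (last_frame_coding == "comfort noise"
                       and first_frame_coding == "active speech")
    return voiced_to_unvoiced or noise_to_speech
```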
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation (following relationships) : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
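The relation above scales the excitation energy by the ratio of the LP filter impulse response energies so that the synthesis output keeps a consistent level when the new frame's LP filter has a higher gain than the filter used during concealment. A minimal sketch follows; the 64-sample impulse response truncation and the coefficient sign convention are assumed implementation details.

```python
from typing import List

def lp_impulse_response_energy(a: List[float], length: int = 64) -> float:
    # Energy of the impulse response of the all-pole LP synthesis filter
    # 1 / A(z), with A(z) = 1 + a[0]*z^-1 + a[1]*z^-2 + ... (sign convention
    # and the 64-sample truncation are assumptions for illustration).
    h: List[float] = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                x -= ak * h[n - k]
        h.append(x)
    return sum(v * v for v in h)

def target_excitation_energy(e1: float,
                             a_last_good: List[float],
                             a_first_good: List[float]) -> float:
    # E_q = E_1 * E_LP0 / E_LP1 from the charted relation.
    e_lp0 = lp_impulse_response_energy(a_last_good)   # last good frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)  # first good frame after erasure
    return e1 * e_lp0 / e_lp1
```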
US6029128A
CLAIM 8
. A synthesiser for speech synthesis , comprising first and second excitation sources for respectively generating first and second excitation signals , and modifying means for modifying the first excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal , wherein the modifying means scales the first excitation signal in accordance with a scaling factor (a) derivable from pitch information associated with the first signal , wherein the first excitation source is an adaptive code book and the second excitation source is a fixed code book , wherein the scaling factor (a) is of the form a=b+p , where b is an adaptive code book gain and p is a perceptual enhancement gain factor derivable in accordance with the following relationships ;
##EQU24## where TH represents threshold values , a_enh is a linear scaler and f(b) is a function of gain b , wherein the first and second excitation signals are combined after modification , and further comprising an adaptive energy control means for modifying combined scaled first and second signals in accordance with the following relationship ;
##EQU25## where N is a suitable adaption period , ex(n) is the combined first and second signals , ew'(n) is the combined scaled first and second signals and K is an energy scale factor .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
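A compact way to picture the phase information of claim 22 is as a small record holding the position of the first glottal pulse together with its encoded shape, sign and amplitude; the field layout below is an illustrative assumption only.

```python
from dataclasses import dataclass

@dataclass
class GlottalPulseInfo:
    # Sketch of the charted phase-information payload; field widths and the
    # existence of a shape table are assumptions, not taken from the chart.
    position: int     # sample index of the first glottal pulse in the frame
    shape_index: int  # index into an assumed table of pulse shapes
    sign: int         # +1 or -1
    amplitude_q: int  # quantized amplitude level
```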
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
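Claim 23 locates the first glottal pulse as the maximum-amplitude sample inside one pitch period and quantizes that position. A minimal sketch, assuming a uniform position quantizer with an illustrative step of four samples:

```python
from typing import Sequence, Tuple

def first_glottal_pulse_position(residual: Sequence[float],
                                 pitch_period: int,
                                 step: int = 4) -> Tuple[int, int]:
    # Sketch: take the maximum-amplitude sample inside the first pitch period
    # as the first glottal pulse and quantize its position with a uniform step
    # (the 4-sample step is an illustrative assumption).
    window = residual[:pitch_period]
    position = max(range(len(window)), key=lambda i: abs(window[i]))
    return position, position // step
```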
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook, linear scale) per sample for other frames .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler (average energy) and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook (average energy) and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation (following relationships) : E_q = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6029128A
CLAIM 1
. A synthesiser for speech synthesis , comprising : an excitation source ;
and a post-processing means coupled to said excitation source for operating on a first signal including speech periodicity information derived from said excitation source , wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal (sound signal, speech signal, decoder determines concealment) ;
wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal ;
wherein the excitation source comprises a fixed code book and an adaptive code book , the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books , the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book , the first signal being modified by combining the second signal with the first signal , and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship , ##EQU13## where TH represents threshold values , b is the adaptive code book gain factor , p is the first post-processing means scale factor , a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b , and wherein the post-processing means further comprises an adaptive energy control means adapted to scale a modified first signal in accordance with the following relationship , ##EQU14## where N is a suitably chosen adaption period , ex(n) is the first signal , ew'(n) is a modified first signal and k is an energy scale factor .

US6029128A
CLAIM 8
. A synthesiser for speech synthesis , comprising first and second excitation sources for respectively generating first and second excitation signals , and modifying means for modifying the first excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal , wherein the modifying means scales the first excitation signal in accordance with a scaling factor (a) derivable from pitch information associated with the first signal , wherein the first excitation source is an adaptive code book and the second excitation source is a fixed code book , wherein the scaling factor (a) is of the form a=b+p , where b is an adaptive code book gain and p is a perceptual enhancement gain factor derivable in accordance with the following relationships ;
##EQU24## where TH represents threshold values , a_enh is a linear scaler and f(b) is a function of gain b , wherein the first and second excitation signals are combined after modification , and further comprising an adaptive energy control means for modifying combined scaled first and second signals in accordance with the following relationship ;
##EQU25## where N is a suitable adaption period , ex(n) is the combined first and second signals , ew'(n) is the combined scaled first and second signals and K is an energy scale factor .

US6029128A
CLAIM 12
. A synthesiser for speech synthesis , comprising ;
an input unit for inputting a signal and for extracting coded information from said signal , the coded information comprising fixed codebook and adaptive codebook (sound signal, speech signal, decoder determines concealment) parameters , including an adaptive codebook gain factor ;
an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom , said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook , said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal ;
and a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal , wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is a function of a value of said adaptive codebook gain factor ;
wherein said scaling factor (p) is derived from said adaptive code book gain factor (b) in accordance with the relationships , ##EQU31## where a_enh is a constant that controls a strength of perceptual enhancement and TH are threshold values .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH09120298A

Filed: 1996-06-07     Issued: 1997-05-06

Voiced/unvoiced classification of speech used for speech decoding during frame erasure

(Original Assignee) At & T Ipm Corp (AT&T IPM Corporation)

Peter Kroon, Yair Shoham
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (further comprising) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH09120298A
CLAIM 4
[Claim 4] The method of claim 1, wherein the output signal from the first portion is generated based on a vector signal from the adaptive codebook, and wherein the method further comprises (maximum amplitude): determining an adaptive codebook delay signal based on a measure of a pitch period of the speech signal received by the decoder in a previous frame; and selecting a vector signal using the adaptive codebook delay signal.

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder that has a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook and that generates a speech excitation signal based on an output signal from the first portion or the second portion when at least a part of the compressed speech information (speech signal) of a current frame cannot be reliably received, the method comprising: a classification step of classifying the speech signal to be generated by the decoder as periodic or non-periodic; and a step of generating the excitation signal based on the output signal from the first portion, and not on the output signal from the second portion, when the speech signal is classified as periodic, and generating the excitation signal based on the output signal from the second portion, and not on the output signal from the first portion, when the speech signal is classified as non-periodic.
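Read against the charted device claims, claim 1 of JPH09120298A selects the concealment excitation from only one codebook according to a periodic/non-periodic classification. A minimal sketch of that selection rule, with illustrative names:

```python
from typing import List

def concealment_excitation(periodic: bool,
                           adaptive_cb_vector: List[float],
                           fixed_cb_vector: List[float]) -> List[float]:
    # Sketch of the charted selection rule: when the current frame's compressed
    # speech information is unreliable, build the excitation from the adaptive
    # codebook alone if the speech is classified periodic, otherwise from the
    # fixed codebook alone.
    return list(adaptive_cb_vector) if periodic else list(fixed_cb_vector)
```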

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining the adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder that has a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook and that generates a speech excitation signal based on an output signal from the first portion or the second portion when at least a part of the compressed speech information (speech signal) of a current frame cannot be reliably received, the method comprising: a classification step of classifying the speech signal to be generated by the decoder as periodic or non-periodic; and a step of generating the excitation signal based on the output signal from the first portion, and not on the output signal from the second portion, when the speech signal is classified as periodic, and generating the excitation signal based on the output signal from the second portion, and not on the output signal from the first portion, when the speech signal is classified as non-periodic.

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining the adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder that has a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook and that generates a speech excitation signal based on an output signal from the first portion or the second portion when at least a part of the compressed speech information (speech signal) of a current frame cannot be reliably received, the method comprising: a classification step of classifying the speech signal to be generated by the decoder as periodic or non-periodic; and a step of generating the excitation signal based on the output signal from the first portion, and not on the output signal from the second portion, when the speech signal is classified as periodic, and generating the excitation signal based on the output signal from the second portion, and not on the output signal from the first portion, when the speech signal is classified as non-periodic.

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining the adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (further comprising) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH09120298A
CLAIM 4
[Claim 4] The method of claim 1, wherein the output signal from the first portion is generated based on a vector signal from the adaptive codebook, and wherein the method further comprises (maximum amplitude): determining an adaptive codebook delay signal based on a measure of a pitch period of the speech signal received by the decoder in a previous frame; and selecting a vector signal using the adaptive codebook delay signal.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (further comprising) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09120298A
CLAIM 4
[Claim 4] The method of claim 1, wherein the output signal from the first portion is generated based on a vector signal from the adaptive codebook, and wherein the method further comprises (maximum amplitude): determining an adaptive codebook delay signal based on a measure of a pitch period of the speech signal received by the decoder in a previous frame; and selecting a vector signal using the adaptive codebook delay signal.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder that has a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook and that generates a speech excitation signal based on an output signal from the first portion or the second portion when at least a part of the compressed speech information (speech signal) of a current frame cannot be reliably received, the method comprising: a classification step of classifying the speech signal to be generated by the decoder as periodic or non-periodic; and a step of generating the excitation signal based on the output signal from the first portion, and not on the output signal from the second portion, when the speech signal is classified as periodic, and generating the excitation signal based on the output signal from the second portion, and not on the output signal from the first portion, when the speech signal is classified as non-periodic.

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining the adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder that has a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook and that generates a speech excitation signal based on an output signal from the first portion or the second portion when at least a part of the compressed speech information (speech signal) of a current frame cannot be reliably received, the method comprising: a classification step of classifying the speech signal to be generated by the decoder as periodic or non-periodic; and a step of generating the excitation signal based on the output signal from the first portion, and not on the output signal from the second portion, when the speech signal is classified as periodic, and generating the excitation signal based on the output signal from the second portion, and not on the output signal from the first portion, when the speech signal is classified as non-periodic.

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining the adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
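The gain-control behaviour recited in claims 18 and 19 above (clamping the scaling gain when the first good frame is an onset, and carrying the end-of-frame gain back to the beginning of the frame for voiced-to-unvoiced and comfort-noise-to-active-speech transitions) can be sketched as follows. This is a hedged illustration only; the class labels, coding-mode strings and the 0.8 cap are placeholders, not values taken from the patent.

def recovery_scaling_gains(last_good_class, first_good_class,
                           last_good_coding, first_good_coding,
                           g_begin, g_end, onset_cap=0.8):
    # Claim 18: when the first non-erased frame after the erasure is an onset,
    # limit the gain used for scaling the synthesized signal to a given value.
    if first_good_class == "onset":
        g_begin = min(g_begin, onset_cap)
        g_end = min(g_end, onset_cap)
    # Claim 19 (voiced -> unvoiced): the last good frame before the erasure was
    # voiced transition, voiced or onset and the first good frame is unvoiced.
    if (last_good_class in ("voiced transition", "voiced", "onset")
            and first_good_class == "unvoiced"):
        g_begin = g_end
    # Claim 19 (inactive -> active): comfort noise followed by active speech.
    if last_good_coding == "comfort noise" and first_good_coding == "active speech":
        g_begin = g_end
    return g_begin, g_end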
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder having a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook, the decoder generating a speech excitation signal based on an output signal from said first portion or said second portion when at least part of the compressed speech information (speech signal) of a current frame cannot be received reliably, the method comprising: a classification step of classifying the speech signal to be generated by said decoder as periodic or non-periodic; and a step of generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion when the speech signal is classified as periodic, and generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion when the speech signal is classified as non-periodic.

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining said adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (further comprising) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
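The searcher/quantizer pair recited in this claim can be pictured with the sketch below: the sample of maximum amplitude within the first pitch period of an (assumed) LP residual is taken as the first glottal pulse and its position is quantized. The uniform quantizer and the 6-bit resolution are illustrative assumptions, not the patent's codebook.

import numpy as np

def first_glottal_pulse_position(residual: np.ndarray, pitch: int, bits: int = 6):
    period = residual[:pitch]
    pos = int(np.argmax(np.abs(period)))        # sample of maximum amplitude
    step = max(1, -(-pitch // (1 << bits)))     # ceil(pitch / 2**bits): assumed uniform step
    index = pos // step                          # quantized index sent to the decoder
    return pos, index, index * step              # true position, index, reconstructed position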
JPH09120298A
CLAIM 4
[Claim 4] The method of claim 1, wherein the output signal from said first portion is generated based on a vector signal from said adaptive codebook, the method further comprising (maximum amplitude): determining an adaptive codebook delay signal based on a measure of the pitch period of the speech signal received by said decoder in a previous frame; and selecting a vector signal using said adaptive codebook delay signal.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech decoding method, speech signal, speech information) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09120298A
CLAIM 1
[Claim 1] A speech decoding method (speech signal) for use in a speech decoder having a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook, the decoder generating a speech excitation signal based on an output signal from said first portion or said second portion when at least part of the compressed speech information (speech signal) of a current frame cannot be received reliably, the method comprising: a classification step of classifying the speech signal to be generated by said decoder as periodic or non-periodic; and a step of generating said excitation signal based on the output signal from said first portion and not on the output signal from said second portion when the speech signal is classified as periodic, and generating said excitation signal based on the output signal from said second portion and not on the output signal from said first portion when the speech signal is classified as non-periodic.

JPH09120298A
CLAIM 5
[Claim 5] The method of claim 4, wherein the step of determining said adaptive codebook delay signal comprises increasing the pitch period of the speech signal by one or more speech signal (speech signal) sample periods.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5884252A

Filed: 1996-05-31     Issued: 1999-03-16

Method of and apparatus for coding speech signal

(Original Assignee) NEC Corp     (Current Assignee) Rakuten Inc

Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response, response signal) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response, response signal) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
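The artificial construction of the periodic excitation part recited in claim 1 (a low-pass filtered periodic train of pulses, with the first impulse response of the low-pass filter centered on the quantized first-glottal-pulse position and the remaining responses spaced one average pitch value apart) can be sketched as follows. The overlap-add placement and the truncated impulse-response length are assumptions; lp_imp_resp stands for any short low-pass filter impulse response.

import numpy as np

def artificial_periodic_part(length, q_pulse_pos, avg_pitch, lp_imp_resp):
    exc = np.zeros(length)
    half = len(lp_imp_resp) // 2
    step = max(1, int(round(avg_pitch)))          # pulse spacing = average pitch value
    pos = int(q_pulse_pos)
    while pos < length:                           # up to the end of the affected region
        for k, h in enumerate(lp_imp_resp):       # center one impulse response at pos
            idx = pos - half + k
            if 0 <= idx < length:
                exc[idx] += h
        pos += step                               # next pulse one average pitch later
    return exc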
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
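As a hedged illustration of the shape/sign/amplitude element of claim 2, the sketch below matches the neighbourhood of the first glottal pulse against a small, assumed codebook of normalized pulse shapes and returns the shape index, the pulse sign and its amplitude (amplitude quantization is omitted). The codebook and the correlation-based selection are assumptions, not the patent's method.

import numpy as np

def encode_first_glottal_pulse(residual: np.ndarray, pulse_pos: int, shapes: np.ndarray):
    # shapes: assumed (num_shapes, shape_len) codebook of normalized pulse shapes
    shape_len = shapes.shape[1]
    half = shape_len // 2
    start = max(0, pulse_pos - half)
    seg = residual[start:start + shape_len]
    seg = np.pad(seg, (0, shape_len - len(seg)))          # keep a fixed-length segment
    sign = 1 if residual[pulse_pos] >= 0 else -1
    amp = abs(float(residual[pulse_pos]))
    shape_idx = int(np.argmax(shapes @ (sign * seg)))      # best-correlating shape
    return shape_idx, sign, amp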
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
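The energy control recited in claim 5 (scale the beginning of the first good frame to the energy reached at the end of the erasure, then converge toward the transmitted energy information while limiting any increase) is sketched below. The 64-sample measurement windows, the linear gain ramp and the 1.98 cap are placeholders, not values taken from the patent.

import numpy as np

def energy_recovery_scaling(synth: np.ndarray, e_concealed_end: float,
                            e_received: float, gain_cap: float = 1.98) -> np.ndarray:
    # Energy per sample at the beginning and end of the first non-erased frame
    # (64-sample windows are an assumption).
    e_begin = float(np.dot(synth[:64], synth[:64])) / 64.0 + 1e-12
    e_end = float(np.dot(synth[-64:], synth[-64:])) / 64.0 + 1e-12
    g0 = min(np.sqrt(e_concealed_end / e_begin), gain_cap)   # match end of erasure
    g1 = min(np.sqrt(e_received / e_end), gain_cap)          # converge to received energy
    gains = np.linspace(g0, g1, num=len(synth))              # sample-by-sample interpolation
    return synth * gains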
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5884252A
CLAIM 5
. An apparatus for coding a speech signal , comprising : a mode decision unit for deciding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
an adaptive code book for determining delay with respect to each of said quantization candidates , respectively , outputted from said spectral parameter quantizer , generating a pitch predective signal based on a past excitation signal for each of the delays and associating quantization candidates , and outputting a quantization candidate and a delay which provide a minimum distortion between the speech signal and said pitch predective signal , if the mode decision information outputted from said mode decision unit represents a (LP filter excitation signal) predetermined mode ;
a excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal (LP filter excitation signal) , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (impulse response, response signal) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response, response signal) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
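The relation of claim 9, E_q = E_1 (E_LP0 / E_LP1), can be made concrete by taking E_LP0 and E_LP1 as energies of truncated impulse responses of the LP synthesis filters 1/A(z) of the last good frame before and the first good frame after the erasure. The truncation length and the use of scipy.signal.lfilter are illustrative choices only, not the patent's implementation.

import numpy as np
from scipy.signal import lfilter

def adjusted_excitation_energy(e1: float, a_last_good, a_first_good, n: int = 64) -> float:
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h0 = lfilter([1.0], a_last_good, impulse)   # impulse response of 1/A(z), last good frame
    h1 = lfilter([1.0], a_first_good, impulse)  # impulse response of 1/A(z), first good frame
    e_lp0 = float(np.dot(h0, h0))
    e_lp1 = float(np.dot(h1, h1))
    return e1 * e_lp0 / e_lp1                   # E_q = E_1 * (E_LP0 / E_LP1)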
US5884252A
CLAIM 5
. An apparatus for coding a speech signal , comprising : a mode decision unit for deciding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
an adaptive code book for determining delay with respect to each of said quantization candidates , respectively , outputted from said spectral parameter quantizer , generating a pitch predective signal based on a past excitation signal for each of the delays and associating quantization candidates , and outputting a quantization candidate and a delay which provide a minimum distortion between the speech signal and said pitch predective signal , if the mode decision information outputted from said mode decision unit represents a (LP filter excitation signal) predetermined mode ;
a excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal (LP filter excitation signal) , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (impulse response, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 ( E_LP0 / E_LP1 ) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (impulse response, response signal) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884252A
CLAIM 5
. An apparatus for coding a speech signal , comprising : a mode decision unit for deciding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
an adaptive code book for determining delay with respect to each of said quantization candidates , respectively , outputted from said spectral parameter quantizer , generating a pitch predective signal based on a past excitation signal for each of the delays and associating quantization candidates , and outputting a quantization candidate and a delay which provide a minimum distortion between the speech signal and said pitch predective signal , if the mode decision information outputted from said mode decision unit represents a (LP filter excitation signal) predetermined mode ;
a excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal (LP filter excitation signal) , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (impulse response, response signal) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (impulse response, response signal) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
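
The energy computation recited in the last element of claim 16 can be sketched as follows; quantization of the parameter for transmission is omitted, so this is a simplification rather than the patent's exact procedure.

def energy_information(frame, frame_class):
    """Maximum of the signal energy for voiced or onset frames, average energy
    per sample for all other frames (sketch of the claimed computation)."""
    if frame_class in ("voiced", "onset"):
        return max(x * x for x in frame)
    return sum(x * x for x in frame) / len(frame)
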
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
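
A hedged sketch of the scaling described in claim 17: a per-sample gain is interpolated from g0, which matches the start of the first good frame to the end of the concealed synthesis, toward g1, which matches the transmitted energy parameter, and both gains are clipped to limit an energy increase. The quarter-frame energy windows, the linear interpolation and the clip value are assumptions for illustration.

import math

def scale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    """Scale the synthesized signal of the first non-erased frame so that its
    start matches the concealed frame's end energy and its end converges to
    the received energy parameter, while limiting any increase."""
    n = len(synth)
    w = max(1, n // 4)
    e_begin = sum(x * x for x in synth[:w]) / w        # energy at the frame start
    e_end = sum(x * x for x in synth[-w:]) / w         # energy at the frame end
    g0 = min(math.sqrt(e_end_concealed / (e_begin + 1e-9)), max_gain)
    g1 = min(math.sqrt(e_target / (e_end + 1e-9)), max_gain)
    return [s * (g0 + (g1 - g0) * i / max(n - 1, 1)) for i, s in enumerate(synth)]
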
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (impulse response, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
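
The comparison in claim 20 turns on the "gain of a LP filter", which can be measured as the energy of the truncated impulse response of the synthesis filter 1/A(z); the sketch below shows one way to compute it (the 64-sample truncation length is an assumption). When this gain is higher for the first good frame's filter than for the last concealed frame's filter, the decoder would rescale its excitation accordingly, as made explicit by the relation in claim 21.

def lp_filter_impulse_energy(lpc, length=64):
    """Energy of the truncated impulse response of the synthesis filter 1/A(z),
    with A(z) = 1 + a1*z^-1 + ... + ap*z^-p and lpc = [a1, ..., ap]."""
    h = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0                     # unit impulse input
        y = x - sum(a * h[n - 1 - k] for k, a in enumerate(lpc) if n - 1 - k >= 0)
        h.append(y)
    return sum(v * v for v in h)
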
US5884252A
CLAIM 5
. An apparatus for coding a speech signal , comprising : a mode decision unit for deciding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
an adaptive code book for determining delay with respect to each of said quantization candidates , respectively , outputted from said spectral parameter quantizer , generating a pitch predictive signal based on a past excitation signal for each of the delays and associating quantization candidates , and outputting a quantization candidate and a delay which provide a minimum distortion between the speech signal and said pitch predictive signal , if the mode decision information outputted from said mode decision unit represents a (LP filter excitation signal) predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal (LP filter excitation signal) , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (impulse response, response signal) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (impulse response, response signal) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
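
A one-line sketch of the recited relation, with illustrative numbers only.

def adjusted_excitation_energy(e1, e_lp0, e_lp1):
    """E_q = E_1 * (E_LP0 / E_LP1): the excitation energy target is scaled by
    the ratio of the old to the new LP filter impulse-response energies."""
    return e1 * e_lp0 / e_lp1

# Illustrative numbers: if the new filter's impulse-response energy doubles
# (E_LP1 = 2 * E_LP0), the excitation energy target is halved.
assert adjusted_excitation_energy(1000.0, 4.0, 8.0) == 500.0
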
US5884252A
CLAIM 5
. An apparatus for coding a speech signal , comprising : a mode decision unit for deciding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
an adaptive code book for determining delay with respect to each of said quantization candidates , respectively , outputted from said spectral parameter quantizer , generating a pitch predictive signal based on a past excitation signal for each of the delays and associating quantization candidates , and outputting a quantization candidate and a delay which provide a minimum distortion between the speech signal and said pitch predictive signal , if the mode decision information outputted from said mode decision unit represents a (LP filter excitation signal) predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal (LP filter excitation signal) , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (impulse response, response signal) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 ⁢ E LP ⁢ ⁢ 0 E LP ⁢ ⁢ 1 where E 1 is an energy at an end of a current frame , E LPO is an energy of an impulse response (impulse response, response signal) of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5884252A
CLAIM 5
. An apparatus for coding a speech signal , comprising : a mode decision unit for deciding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
an adaptive code book for determining delay with respect to each of said quantization candidates , respectively , outputted from said spectral parameter quantizer , generating a pitch predictive signal based on a past excitation signal for each of the delays and associating quantization candidates , and outputting a quantization candidate and a delay which provide a minimum distortion between the speech signal and said pitch predictive signal , if the mode decision information outputted from said mode decision unit represents a (LP filter excitation signal) predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 7
. An apparatus for coding a speech signal , comprising : a mode decision unit for decoding a mode of an inputted speech signal and outputting mode decision information ;
a spectral parameter calculator for determining spectral parameters from the speech signal , quantizing the spectral parameters , and outputting a plurality of quantization candidates ;
a spectral parameter and delay calculator for calculating spectral parameters and a first delay from a signal extracted from a past excitation signal for a delay and an inputted speech signal ;
a spectral parameter quantizer for quantizing the spectral parameters and outputting at least one quantization candidate ;
an adaptive codebook (sound signal, speech signal) code book for determining a second delay based on said first delay , calculating at least one second delay candidate neighboring said first delay , generating a pitch predictive signal calculated using a signal extracted from a past excitation signal for said second delay candidate and quantization candidate , for all of the combinations between each of second delay candidates and each of quantization candidates , if the mode decision information outputted from said mode decision unit represents a predetermined mode ;
an excitation quantizer for quantizing and outputting the excitation signal of said speech signal ;
and a gain quantizer for quantizing and outputting a gain of at least one of said adaptive code book and said quantized excitation signal .

US5884252A
CLAIM 17
. The apparatus according to claim 1 , further comprising : an impulse response (impulse responses, impulse response, LP filter) calculator for receiving the quantized spectral parameters and for calculating and outputting an impulse response of a weighting filter based on the quantized spectral parameters ;
a weighting signal calculator for weighting the quantized gain output by said gain quantizer and for outputting a weighted signal as a result thereof ;
a response signal (impulse responses, impulse response, LP filter) quantizer for receiving the quantization candidates from said spectral parameter calculator for each of a plurality of subframes , and for calculating and outputting , using a stored value of a filter memory , a response signal for one subframe ;
an audio weighting circuit for receiving the inputted speech signal divided into subframes and for receiving the plurality of quantization candidates from said spectral parameter calculator , for calculating an audio weighting on the speech signal in each of the subframes , and for outputting an audio-weighted speech signal as a result thereof ;
and a subtractor for subtracting the audio-weighted speech signal from the response signal to produce a subtracted signal as a result ;
wherein said adaptive code book comprises : a delay searching and distortion calculating circuit which receives the past excitation signal on a first input terminal (LP filter excitation signal) , the subtracted signal on a second input terminal , and the impulse response on a third input terminal , and for determining the delay as a result ;
a decision circuit for receiving a plurality of distortions and corresponding delays from the delay searching and distortion calculating circuit , and for determining the delay which provides the minimum distortion between the speech signal and said pitch predictive signal ;
and a residual calculator connected to receive the delay which provides the minimum distortion from said decision circuit and for effecting pitch prediction to determine a corresponding pitch predictive signal that is output to said excitation quantizer .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP0747882A2

Filed: 1996-05-29     Issued: 1996-12-11

Pitch delay modification during frame erasures

(Original Assignee) AT&T Corp; AT&T IPM Corp     (Current Assignee) AT&T Corp

Yair Shoham
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
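
A minimal Python sketch of the artificial periodic excitation described in claim 1: a train of low-pass filter impulse responses, the first centred at the quantized first-glottal-pulse position and the remaining ones spaced by the average pitch value. The 5-tap impulse response and the frame length in the example line are assumptions, not values from the patent.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_impulse):
    """Place copies of the low-pass filter impulse response at the quantized
    first-pulse position and then every avg_pitch samples up to the frame end."""
    exc = np.zeros(frame_len)
    half = len(lp_impulse) // 2
    pos = float(first_pulse_pos)
    while pos < frame_len:
        start = int(round(pos)) - half
        for i, h in enumerate(lp_impulse):
            idx = start + i
            if 0 <= idx < frame_len:
                exc[idx] += h
        pos += avg_pitch                                # average pitch spacing
    return exc

# Example: 256-sample frame, first pulse at sample 37, average pitch of 80.5 samples.
excitation = build_periodic_excitation(256, 37, 80.5, [0.1, 0.25, 0.3, 0.25, 0.1])
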
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .
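
The EP0747882A2 mechanism quoted in claims 1 and 5 above amounts to repeating and then incrementing the stored pitch period across consecutive unreliable frames. A minimal sketch follows; the one-sample increment and the threshold value are assumptions for illustration.

def concealed_pitch_delays(stored_pitch, num_erased_frames, threshold=143):
    """Pitch-period values used for successive erased frames: reuse the stored
    value, then increment it for the next erased frame unless it already
    exceeds the threshold (sketch of the quoted EP0747882A2 claims)."""
    delays, pitch = [], stored_pitch
    for _ in range(num_erased_frames):
        delays.append(pitch)
        if pitch <= threshold:
            pitch += 1
    return delays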

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 3
The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises incrementing) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747882A2
CLAIM 3
The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

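For orientation, the energy-adjustment relation recited in claims 8 and 9 above (E_q = E_1 (E_LP0 / E_LP1)) can be sketched as follows. This is a minimal illustration only, not the patent's reference code: the impulse-response length, the LP coefficient sign convention and all function names are assumptions.

```python
# Minimal sketch of the claim 8/9 energy adjustment; assumptions noted in comments.
import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z).
    Assumes A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p (a[0] == 1); `length` is illustrative."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * h[n - k]          # recursive part of 1/A(z)
        h[n] = acc
    return float(np.sum(h ** 2))

def adjust_excitation_energy(excitation, e1, a_last_good, a_first_good):
    """Rescale the first good frame's excitation so its energy equals
    E_q = E_1 * (E_LP0 / E_LP1), per the relation reconstructed above."""
    e_lp0 = lp_impulse_response_energy(a_last_good)    # last good frame before erasure
    e_lp1 = lp_impulse_response_energy(a_first_good)   # first good frame after erasure
    e_q = e1 * e_lp0 / max(e_lp1, 1e-12)
    e_cur = float(np.sum(np.asarray(excitation, dtype=float) ** 2)) + 1e-12
    return np.asarray(excitation, dtype=float) * np.sqrt(e_q / e_cur)
```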
US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

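Claims 10, 11, 14 and 15 above all turn on the phase-information parameter: locating the first glottal pulse as the maximum-amplitude sample within a pitch period and quantizing its position (claim 10 additionally encodes its shape, sign and amplitude). A hedged sketch under assumed parameters (residual-domain search, uniform 6-bit grid) might look like this:

```python
# Illustrative only: the search domain and quantization grid are assumptions,
# not taken from the patent.
import numpy as np

def find_first_glottal_pulse(residual, pitch_period):
    """Index of the maximum-amplitude sample within the first pitch period."""
    return int(np.argmax(np.abs(residual[:int(pitch_period)])))

def quantize_pulse_position(position, frame_length, bits=6):
    """Uniform quantization of the pulse position relative to the frame start."""
    levels = 2 ** bits
    step = frame_length / levels
    return min(int(position / step), levels - 1)

def decode_pulse_position(index, frame_length, bits=6):
    """Decoder-side reconstruction of the approximate pulse position."""
    step = frame_length / (2 ** bits)
    return int(index * step + step / 2)
```

The sign of the located sample would supply the pulse sign; the shape and amplitude coding of claim 10 is omitted from this sketch.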
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 3
The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

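Claim 13 (and method claim 1, charted later in this section against US5845244A) describes rebuilding the periodic excitation of a lost onset frame as a low-pass filtered train of pulses: the first low-pass impulse response is centered on the decoded glottal-pulse position and the remaining copies are spaced by the average pitch value. The sketch below uses a generic windowed-sinc low-pass filter; the filter design, tap count and cutoff are assumptions, not the patent's.

```python
import numpy as np

def lowpass_impulse_response(num_taps=31, cutoff=0.25):
    """Generic windowed-sinc FIR low-pass impulse response (normalized cutoff)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    return 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(num_taps)

def build_periodic_excitation(frame_length, first_pulse_pos, avg_pitch, num_taps=31):
    """Center one impulse response on the quantized first-pulse position, then
    place further copies every avg_pitch samples up to the end of the buffer."""
    h = lowpass_impulse_response(num_taps)
    half = num_taps // 2
    buf = np.zeros(frame_length + num_taps)    # padding so pulses near the edges fit
    pos = int(first_pulse_pos)
    while pos < frame_length:
        buf[pos:pos + num_taps] += h           # center of h lands on output sample `pos`
        pos += int(round(avg_pitch))
    return buf[half:half + frame_length]
```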
US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

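Claims 4, 16 and 24 compute the energy information parameter differently depending on the frame class: a maximum of the signal energy for voiced or onset frames, an average energy per sample otherwise. A minimal sketch, reading "maximum of a signal energy" as the peak squared sample (an assumption) and leaving quantization aside:

```python
import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter per the claim language; the dB mapping is illustrative."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        e = float(np.max(x ** 2))      # maximum of the signal energy
    else:
        e = float(np.mean(x ** 2))     # average energy per sample
    return 10.0 * np.log10(e + 1e-12)  # typically sent as a quantized log-domain value
```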
US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

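Claims 5 to 7 and 17 to 19 describe the decoder-side energy control: scale the synthesized signal so its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converge toward the received energy information by the end of that frame while limiting any energy increase (with the gains forced equal, or capped, for onset and comfort-noise/active-speech transitions). A sketch under assumed measurement windows and an arbitrary gain cap:

```python
import numpy as np

def energy_control(synth, e_end_concealed, e_target, win=64, max_gain=2.0):
    """Scale `synth` (first good frame) with a sample-by-sample interpolated gain.
    `win`, `max_gain` and the linear interpolation are illustrative choices."""
    x = np.asarray(synth, dtype=float)
    n = len(x)
    w = min(win, n)
    e_begin = float(np.sum(x[:w] ** 2)) + 1e-12              # energy at the frame start
    e_end = float(np.sum(x[-w:] ** 2)) + 1e-12               # energy toward the frame end
    g0 = min(np.sqrt(e_end_concealed / e_begin), max_gain)   # match the concealed-frame energy
    g1 = min(np.sqrt(e_target / e_end), max_gain)            # converge to the received energy
    gains = g0 + (g1 - g0) * np.arange(n) / max(n - 1, 1)    # the cap limits the increase
    return x * gains
```

For the onset and comfort-noise cases of claims 6, 7, 18 and 19, g1 would simply be set equal to g0, or clipped to a fixed value, before the interpolation.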
US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook, speech signal) is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 3
The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises incrementing) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747882A2
CLAIM 3
The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook, speech signal) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook, speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook, speech signal) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises incrementing) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
EP0747882A2
CLAIM 1
A method for use in a speech decoder which fails to receive reliably at least a portion of each of first and second consecutive frames of compressed speech information , the speech decoder including a codebook memory for supplying a vector signal in response to a signal representing pitch-period information , the vector signal for use in generating a decoded speech signal (sound signal, speech signal, decoder determines concealment) , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and incrementing said value of said signal for use in said second frame , such that said codebook memory supplies a vector signal in response to the incremented value of said signal .

EP0747882A2
CLAIM 3
The method of claim 2 wherein the step of incrementing comprises incrementing (LP filter) a number of samples representing a pitch-period .

EP0747882A2
CLAIM 5
A method for use in a speech decoder which fails to receive reliably at least a portion of a frame of compressed speech information for first and second consecutive frames , the speech decoder including an adaptive codebook (sound signal, speech signal, decoder determines concealment) memory for supplying codebook vector signals for use in generating a decoded speech signal in response to a signal representing pitch-period information , the method comprising : storing a signal having a value representing pitch-period information corresponding to said first frame ;
and if said stored value does not exceed a threshold , incrementing said value of said signal for use in said second frame .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5845244A

Filed: 1996-05-13     Issued: 1998-12-01

Adapting noise masking level in analysis-by-synthesis employing perceptual weighting

(Original Assignee) France Telecom SA     (Current Assignee) Orange SA

Stephane Proust
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (successive frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal digitized as successive frames (onset frame) in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z)=A(z/γ 1)/A(z/γ 2) where ##EQU12## the coefficients a i being linear prediction coefficients obtained in the linear prediction analysis step , and γ 1 and γ 2 denoting spectral expansion coefficients such that 0≦γ 2 ≦γ 1 ≦1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

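US5845244A claim 1, used as the secondary reference from here on, rests on the perceptual weighting filter W(z)=A(z/γ 1)/A(z/γ 2) with spectral expansion coefficients adapted from the LP analysis. The sketch below only shows the standard γ-scaling of the LP coefficients and a direct-form filtering; the adaptation rule for γ 1 and γ 2, which is the point of that claim, is not reproduced, and the default γ values are illustrative.

```python
import numpy as np

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma) given A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p."""
    a = np.asarray(a, dtype=float)
    return a * gamma ** np.arange(len(a))

def perceptual_weighting(x, a, gamma1=0.92, gamma2=0.6):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2) in direct form."""
    num = bandwidth_expand(a, gamma1)   # numerator A(z/γ1)
    den = bandwidth_expand(a, gamma2)   # denominator A(z/γ2), den[0] == 1
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n >= k)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n >= k)
        y[n] = acc
    return y
```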
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z)=A(z/γ 1)/A(z/γ 2) where ##EQU12## the coefficients a i being linear prediction coefficients obtained in the linear prediction analysis step , and γ 1 and γ 2 denoting spectral expansion coefficients such that 0≦γ 2 ≦γ 1 ≦1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z)=A(z/γ 1)/A(z/γ 2) where ##EQU12## the coefficients a i being linear prediction coefficients obtained in the linear prediction analysis step , and γ 1 and γ 2 denoting spectral expansion coefficients such that 0≦γ 2 ≦γ 1 ≦1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z)=A(z/γ 1)/A(z/γ 2) where ##EQU12## the coefficients a i being linear prediction coefficients (comfort noise) obtained in the linear prediction analysis step , and γ 1 and γ 2 denoting spectral expansion coefficients such that 0≦γ 2 ≦γ 1 ≦1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (successive frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal digitized as successive frames (onset frame) in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z)=A(z/γ 1)/A(z/γ 2) where ##EQU12## the coefficients a i being linear prediction coefficients obtained in the linear prediction analysis step , and γ 1 and γ 2 denoting spectral expansion coefficients such that 0≦γ 2 ≦γ 1 ≦1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
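A minimal sketch of the energy information parameter described in the claim 16 passage above, assuming the frame classification label is already available: maximum of the signal energy for voiced/onset frames, average energy per sample otherwise. The patent computes this over pitch-synchronous windows and quantizes it in the log domain; those details are omitted here.

import numpy as np

def energy_information(frame, frame_class):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x ** 2))     # maximum of the signal energy
    return float(np.mean(x ** 2))        # average energy per sample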
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z) = A(z/γ1)/A(z/γ2) , where A(z) = 1 + a1·z^−1 + . . . + ap·z^−p , the coefficients ai being linear prediction coefficients obtained in the linear prediction analysis step , and γ1 and γ2 denoting spectral expansion coefficients such that 0 ≦ γ2 ≦ γ1 ≦ 1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z) = A(z/γ1)/A(z/γ2) , where A(z) = 1 + a1·z^−1 + . . . + ap·z^−p , the coefficients ai being linear prediction coefficients obtained in the linear prediction analysis step , and γ1 and γ2 denoting spectral expansion coefficients such that 0 ≦ γ2 ≦ γ1 ≦ 1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
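The two gain rules recited in claims 18 and 19 above can be summarized in a small decision helper; the onset gain cap of 0.8 and the class labels are illustrative assumptions, not values from the patent.

VOICED_LIKE = ("voiced transition", "voiced", "onset")

def recovery_gain(g_begin, g_end, first_good_class, last_good_class,
                  last_good_was_comfort_noise, onset_cap=0.8):
    # Claim 18: cap the scaling gain when the first good frame is an onset.
    if first_good_class == "onset":
        return min(g_begin, onset_cap)
    # Claim 19: reuse the end-of-frame gain across voiced -> unvoiced and
    # comfort-noise -> active-speech transitions.
    if (last_good_class in VOICED_LIKE and first_good_class == "unvoiced") \
            or last_good_was_comfort_noise:
        return g_end
    return g_begin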
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z) = A(z/γ1)/A(z/γ2) , where A(z) = 1 + a1·z^−1 + . . . + ap·z^−p , the coefficients ai being linear prediction coefficients (comfort noise) obtained in the linear prediction analysis step , and γ1 and γ2 denoting spectral expansion coefficients such that 0 ≦ γ2 ≦ γ1 ≦ 1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5845244A
CLAIM 1
. Analysis-by-synthesis speech coding method , comprising the following steps : linear prediction analysis of order p of a speech signal (speech signal, decoder determines concealment) digitized as successive frames in order to determine parameters defining a short-term synthesis filter ;
determination of excitation parameters defining an excitation signal to be applied to the short-term synthesis filter in order to produce a synthetic signal representative of the speech signal , some at least of the excitation parameters being determined by minimizing the energy of an error signal resulting from a filtering of a difference between the speech signal and the synthetic signal by at least one perceptual weighting filter having a transfer function of the form W(z) = A(z/γ1)/A(z/γ2) , where A(z) = 1 + a1·z^−1 + . . . + ap·z^−p , the coefficients ai being linear prediction coefficients obtained in the linear prediction analysis step , and γ1 and γ2 denoting spectral expansion coefficients such that 0 ≦ γ2 ≦ γ1 ≦ 1 ;
and production of quantization values of the parameters defining the short-term synthesis filter and of the excitation parameters , wherein the value of at least one of the spectral expansion coefficients is adapted on the basis of spectral parameters obtained in the linear prediction analysis step .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH09281996A

Filed: 1996-04-15     Issued: 1997-10-31

Voiced/unvoiced sound determination method and apparatus, and speech encoding method (有声音/無声音判定方法及び装置、並びに音声符号化方法)

(Original Assignee) Sony Corp; ソニー株式会社     

Kazuyuki Iijima, Atsushi Matsumoto, Masayuki Nishiguchi, Shiro Omori, 士郎 大森, 淳 松本, 正之 西口, 和幸 飯島
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .
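The sigmoid mapping in JPH09281996A claim 1 is straightforward to reproduce; the weighted combination below is a hypothetical illustration of how several mapped parameters (frame energy, autocorrelation peak, etc., per claim 3) might be fused into a voiced/unvoiced decision, not the reference's actual decision logic.

import math

def sigmoid_map(x, A=1.0, a=1.0, b=0.0):
    # g(x) = A / (1 + exp(-(x - b)/a)); A, a, b are constants per the claim.
    return A / (1.0 + math.exp(-(x - b) / a))

def voiced_unvoiced(mapped_values, weights, threshold=0.5):
    # Hypothetical fusion of sigmoid-mapped parameters into a V/UV flag.
    score = sum(w * g for g, w in zip(mapped_values, weights))
    return score >= threshold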

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
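A sketch of the claim 3 phase-information step quoted above: the first glottal pulse is taken as the sample of maximum amplitude inside one pitch period and its position is quantized. The 6-bit uniform quantizer is an assumption for illustration; the claim itself does not fix the quantizer design.

import numpy as np

def first_glottal_pulse(excitation, pitch_period, n_bits=6):
    seg = np.abs(np.asarray(excitation[:pitch_period], dtype=float))
    pos = int(np.argmax(seg))            # sample of maximum amplitude
    step = max(1, int(np.ceil(pitch_period / float(2 ** n_bits))))
    return pos, (pos // step) * step     # true and quantized positions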
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

JPH09281996A
CLAIM 5
[Claim 5] A voiced/unvoiced determination apparatus for determining whether an input speech signal is voiced or unvoiced , characterized by comprising (有すること: maximum amplitude) : function calculation means for obtaining a function output value by transforming a parameter x for the voiced/unvoiced determination of the input speech signal by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants ; and means for performing the voiced/unvoiced determination using the value obtained by the function calculation means on the basis of the sigmoid function g(x) .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
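A rough sketch of the claim 5 energy control quoted above: the first good frame starts at a gain that preserves continuity with the concealed signal and converges toward the gain implied by the received energy information, with the upward step capped. Linear per-sample interpolation and the 1.2 cap are assumptions; the codec interpolates gains on its own schedule.

import numpy as np

def scale_first_good_frame(synth, g_begin, g_end, max_increase=1.2):
    g_end = min(g_end, g_begin * max_increase)     # limit the energy increase
    gains = np.linspace(g_begin, g_end, num=len(synth))
    return np.asarray(synth, dtype=float) * gains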
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment (の少なくとも1) and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 5
[Claim 5] A voiced/unvoiced determination apparatus for determining whether an input speech signal is voiced or unvoiced , characterized by comprising (有すること: maximum amplitude) : function calculation means for obtaining a function output value by transforming a parameter x for the voiced/unvoiced determination of the input speech signal by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants ; and means for performing the voiced/unvoiced determination using the value obtained by the function calculation means on the basis of the sigmoid function g(x) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment (の少なくとも1) and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · √(E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
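A numeric sketch of the claim 12 relation E_q = E_1 · √(E_LP0 / E_LP1), with the LP filter impulse-response energies estimated from a truncated response of 1/A(z); the truncation length of 64 samples is an assumption made for illustration.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(lp_coeffs, length=64):
    # Energy of the (truncated) impulse response of 1/A(z).
    d = np.zeros(length); d[0] = 1.0
    h = lfilter([1.0], lp_coeffs, d)
    return float(np.sum(h ** 2))

def adjusted_energy(E1, lp_last_good, lp_first_good):
    # E_q = E_1 * sqrt(E_LP0 / E_LP1)
    return E1 * np.sqrt(lp_impulse_energy(lp_last_good) / lp_impulse_energy(lp_first_good))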
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment (の少なくとも1) and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (入力音) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 5
[Claim 5] A voiced/unvoiced determination apparatus for determining whether an input speech signal is voiced or unvoiced , characterized by comprising (有すること: maximum amplitude) : function calculation means for obtaining a function output value by transforming a parameter x for the voiced/unvoiced determination of the input speech signal by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants ; and means for performing the voiced/unvoiced determination using the value obtained by the function calculation means on the basis of the sigmoid function g(x) .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment (の少なくとも1) and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment (の少なくとも1) and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (音声信号) is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 5
[Claim 5] A voiced/unvoiced determination apparatus for determining whether an input speech signal is voiced or unvoiced , characterized by comprising (有すること: maximum amplitude) : function calculation means for obtaining a function output value by transforming a parameter x for the voiced/unvoiced determination of the input speech signal by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants ; and means for performing the voiced/unvoiced determination using the value obtained by the function calculation means on the basis of the sigmoid function g(x) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (音声信号) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声信号) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (音声信号) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment (の少なくとも1) and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · √(E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH09281996A
CLAIM 1
[Claim 1] A voiced/unvoiced determination method for determining whether an input speech signal (入力音声信号: sound signal, speech signal) is voiced or unvoiced , characterized in that a parameter x for the voiced/unvoiced determination of the input speech signal is transformed by a sigmoid function g(x) expressed as g(x) = A/(1 + exp(−(x−b)/a)) , where A , a and b are constants , and the voiced/unvoiced determination is performed using the parameter transformed by the sigmoid function g(x) .

JPH09281996A
CLAIM 3
[Claim 3] The voiced/unvoiced determination method according to claim 1 , wherein at least one (の少なくとも1: conducting frame erasure concealment) of a frame average energy , a normalized autocorrelation peak value , a spectral similarity , a zero-crossing count , and a pitch period of the input speech signal is used as the parameter for said voiced/unvoiced determination .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5778335A

Filed: 1996-02-26     Issued: 1998-07-07

Method and apparatus for efficient multiband celp wideband speech and music coding and decoding

(Original Assignee) University of California     (Current Assignee) University of California

Anil Wamanrao Ubale, Allen Gersho
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .
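For contrast with the target patent's concealment logic, the following is a generic analysis-by-synthesis codebook search of the kind recited in US5778335A claim 1, reduced to a single codebook: each codevector is filtered, gain-matched to the perceptually weighted target, and the index with the smallest residual error is kept. This is a textbook CELP sketch under those assumptions, not the reference's multiband structure.

import numpy as np

def search_codebook(target, codebook, h):
    # target: perceptually weighted target (one subframe);
    # codebook: iterable of subframe-length codevectors; h: synthesis impulse response.
    best = (0, 0.0, np.inf)
    for i, c in enumerate(codebook):
        y = np.convolve(c, h)[:len(target)]                 # filtered codevector
        g = float(np.dot(target, y) / (np.dot(y, y) + 1e-12))
        err = float(np.sum((np.asarray(target) - g * y) ** 2))
        if err < best[2]:
            best = (i, g, err)
    return best[:2]                                         # (index, gain)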

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .
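
The last element of claim 4 splits the energy information parameter by frame class: a maximum of the signal energy for frames classified as voiced or onset, an average energy per sample for other frames. A simplified sketch of that computation, assuming hypothetical names (frame, frame_class) and taking the per-sample maximum as the most literal reading of "maximum of a signal energy":

import numpy as np

def energy_information(frame: np.ndarray, frame_class: str) -> float:
    """Energy information parameter of claim 4 (illustrative sketch only).

    frame       : time-domain samples of the current frame
    frame_class : one of 'unvoiced', 'unvoiced transition',
                  'voiced transition', 'voiced', 'onset'
    """
    if frame_class in ('voiced', 'onset'):
        # Voiced or onset frames: relate the parameter to a maximum of
        # the signal energy (here, simply the largest squared sample).
        energy = float(np.max(frame ** 2))
    else:
        # Other frames: relate the parameter to the average energy per sample.
        energy = float(np.mean(frame ** 2))
    # Returning the value on a dB scale is an assumption, not claim language.
    return float(10.0 * np.log10(energy + 1e-12))

A pitch-synchronous maximum over pitch periods would also satisfy this wording; the per-sample maximum above is only the simplest literal reading.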

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .
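
Claim 5's energy control reduces to two gains: one that matches the start of the first non erased frame to the end of the concealed segment, and one that converges toward the transmitted energy by the end of that frame while capping any increase. A sketch under those assumptions, with hypothetical parameter names and an illustrative gain cap:

import numpy as np

def scale_first_good_frame(synth: np.ndarray,
                           e_last_erased: float,
                           e_begin: float,
                           e_target: float,
                           e_end: float,
                           max_gain: float = 2.0) -> np.ndarray:
    """Energy control of claim 5 (illustrative sketch only).

    synth         : synthesized samples of the first non erased frame
    e_last_erased : energy at the end of the last erased (concealed) frame
    e_begin       : measured energy at the beginning of synth
    e_target      : energy conveyed by the received energy information parameter
    e_end         : measured energy at the end of synth
    max_gain      : cap limiting any increase in energy (value is illustrative)
    """
    # Gain rendering the start of the frame similar in energy to the end of
    # the concealed segment.
    g0 = np.sqrt(e_last_erased / (e_begin + 1e-12))
    # Gain converging toward the transmitted energy by the end of the frame,
    # while limiting an increase in energy.
    g1 = min(np.sqrt(e_target / (e_end + 1e-12)), max_gain)
    # Interpolate the gain sample by sample across the frame.
    gains = np.linspace(g0, g1, num=len(synth))
    return synth * gains

A gain interpolated linearly across the frame is one straightforward way to realize the "converging ... toward the end" language; other interpolation shapes would also fit the claim.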

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients (comfort noise) for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .
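
Written out, the energy-adjustment relation recited in claims 9 , 12 , 21 and 25 reads :

\[
  E_q = E_1 \, \frac{E_{LP0}}{E_{LP1}}
\]

where E_1 is the energy at the end of the current frame , E_{LP0} is the energy of the impulse response of the LP filter of the last non erased frame received before the erasure , and E_{LP1} is the energy of the impulse response of the LP filter of the first non erased frame received following the erasure .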

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .
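
Claims 1 and 13 describe the artificial periodic part for a lost onset as a low-pass filtered pulse train: the first impulse response centred on the quantized glottal-pulse position, the remaining responses spaced by the average pitch up to the end of the last affected subframe. A sketch of that construction, with hypothetical names (frame_len, q_pulse_pos, avg_pitch, lowpass_ir) assumed for illustration:

import numpy as np

def build_onset_excitation(frame_len: int,
                           q_pulse_pos: int,
                           avg_pitch: int,
                           lowpass_ir: np.ndarray) -> np.ndarray:
    """Artificial periodic excitation of claim 13 (illustrative sketch only).

    frame_len   : number of samples covered by the artificial construction
                  (up to the end of the last affected subframe)
    q_pulse_pos : quantized position of the first glottal pulse, relative to
                  the beginning of the onset frame
    avg_pitch   : average pitch value in samples
    lowpass_ir  : impulse response of the low-pass filter, assumed centred
                  at index len(lowpass_ir) // 2
    """
    excitation = np.zeros(frame_len)
    half = len(lowpass_ir) // 2
    pos = q_pulse_pos
    # Centre the first impulse response on the quantized pulse position, then
    # place the remaining responses every avg_pitch samples.
    while pos < frame_len:
        lo, hi = pos - half, pos - half + len(lowpass_ir)
        ir_lo = max(0, -lo)
        ir_hi = len(lowpass_ir) - max(0, hi - frame_len)
        excitation[max(lo, 0):min(hi, frame_len)] += lowpass_ir[ir_lo:ir_hi]
        pos += avg_pitch
    return excitation

The sketch treats the "low-pass filtered periodic train of pulses" as a superposition of shifted low-pass impulse responses, which is how the claim itself words the construction.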

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients (comfort noise) for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5778335A
CLAIM 1
. A method for encoding and decoding sound , comprising the steps of : analyzing an input waveform and computing the linear prediction coefficients for a portion of the input waveform ;
classifying the input waveform as one of a group comprising speech and music ;
generating a first plurality of codebooks , each having an output , where each codebook is associated with a frequency band ;
generating at least one first adaptive codebook (sound signal, speech signal) having an output ;
coupling the output of the first plurality of codebooks and the output of the at least one first adaptive codebook together to create a composite waveform ;
synthesis filtering the composite waveform ;
perceptually weighting the input waveform ;
perceptually weighting the synthesis filtered composite waveform ;
differencing the perceptually weighted synthesis filtered composite waveform from the perceptually weighted input waveform to form an output waveform ;
searching through the first plurality of codebooks and the adaptive codebook to minimize the errors in the output waveform ;
and decoding the output waveform using a second plurality of codebooks and at least one second adaptive codebook .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5717822A

Filed: 1996-02-16     Issued: 1998-02-10

Computational complexity reduction during frame erasure of packet loss

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Juin-Hwey Chen
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5717822A
CLAIM 7
. The method of claim 5 wherein the one or more signal processing operations for storing signals further comprises : generating a first signal representing a root-mean-square of a set of synthesized excitation signal samples (impulse responses) ;
generating a second signal representing the logarithm of the first signal ;
and generating the signal reflecting a synthesized excitation signal by forming a difference between the second signal and a constant signal stored in memory .
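
For reference, the three storage operations of US5717822A claim 7 (root-mean-square of the synthesized excitation, its logarithm, then a difference against a stored constant) can be sketched as follows; the variable names and the base-10 logarithm are assumptions for illustration, not taken from the reference:

import numpy as np

def stored_log_gain(excitation: np.ndarray, stored_constant: float) -> float:
    """Signal-storage operations of US5717822A claim 7 (illustrative sketch only).

    excitation      : set of synthesized excitation signal samples
    stored_constant : constant signal held in memory (assumed a log-domain offset)
    """
    # First signal : root-mean-square of the excitation samples.
    rms = float(np.sqrt(np.mean(excitation ** 2)))
    # Second signal : logarithm of the first signal.
    log_rms = float(np.log10(rms + 1e-12))
    # Stored signal : difference between the logarithm and the stored constant.
    return log_rms - stored_constant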

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients (comfort noise) based on said autocorrelation coefficients .
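
US5717822A claim 10 pairs eleven autocorrelation lags with a tenth-order LP analysis; the conventional route from one to the other is the Levinson-Durbin recursion, sketched below as a generic textbook implementation rather than code from either patent:

import numpy as np

def lpc_from_stored_signal(signal: np.ndarray, order: int = 10) -> np.ndarray:
    """Eleven autocorrelation lags -> tenth-order LP coefficients (sketch).

    signal : stored samples from the LPC synthesis filter state
    order  : LP order (10, so order + 1 = 11 autocorrelation coefficients)
    """
    # Eleven autocorrelation coefficient signals r[0] .. r[10].
    r = np.array([float(np.dot(signal[:len(signal) - k], signal[k:]))
                  for k in range(order + 1)])

    # Levinson-Durbin recursion (textbook form).
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err                # reflection coefficient
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)          # prediction error update
    return a                          # a[0] = 1, a[1..10] are the LP coefficients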

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (autocorrelation coefficients) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients based on said autocorrelation coefficients (LP filter, LP filter excitation signal) .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (autocorrelation coefficients) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients based on said autocorrelation coefficients (LP filter, LP filter excitation signal) .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (autocorrelation coefficients) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients based on said autocorrelation coefficients (LP filter, LP filter excitation signal) .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5717822A
CLAIM 7
. The method of claim 5 wherein the one or more signal processing operations for storing signals further comprises : generating a first signal representing a root-mean-square of a set of synthesized excitation signal samples (impulse responses) ;
generating a second signal representing the logarithm of the first signal ;
and generating the signal reflecting a synthesized excitation signal by forming a difference between the second signal and a constant signal stored in memory .
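
Claim 13 above constructs the periodic excitation part for a lost onset frame as a low-pass filtered train of pulses: the first low-pass filter impulse response is centred on the quantized position of the first glottal pulse and the remaining responses are placed one average pitch value apart up to the end of the last affected subframe. A hedged sketch, assuming a windowed-sinc low-pass prototype and arbitrary frame/subframe sizes (none taken from the patent):

    import numpy as np

    def lowpass_impulse_response(num_taps=31, cutoff=0.25):
        """Windowed-sinc FIR low-pass prototype (cutoff as a fraction of the sampling rate)."""
        n = np.arange(num_taps) - (num_taps - 1) / 2.0
        return 2.0 * cutoff * np.sinc(2.0 * cutoff * n) * np.hamming(num_taps)

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, last_subframe_end):
        """Train of low-pass impulse responses: the first one centred at the decoded
        glottal-pulse position, the rest spaced by the average pitch value."""
        h = lowpass_impulse_response()
        half = len(h) // 2
        exc = np.zeros(frame_len)
        pos = first_pulse_pos
        while pos < last_subframe_end:
            for i, tap in enumerate(h):            # overlap-add one centred response
                idx = pos - half + i
                if 0 <= idx < frame_len:
                    exc[idx] += tap
            pos += avg_pitch                        # next pulse one average pitch later
        return exc

    # Example: 256-sample frame, first pulse at sample 35, average pitch of 80 samples,
    # construction limited to the first three 64-sample subframes.
    excitation = build_periodic_excitation(256, 35, 80, 192)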

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficient (comfort noise) s based on said autocorrelation coefficients .
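
The gain rule in claim 19 above is a conditional on the classification of the frames around the erasure: the gain used at the beginning of the first good frame is forced equal to the gain used at its end on a voiced-to-unvoiced transition and on a comfort-noise-to-active-speech transition. A minimal sketch, with class labels and flag names that are assumptions only:

    VOICED_TRANSITION, VOICED, ONSET, UNVOICED = "voiced_transition", "voiced", "onset", "unvoiced"

    def gain_at_frame_start(g_end, g_interpolated, last_good_class, first_good_class,
                            last_good_was_comfort_noise, first_good_is_active_speech):
        """Gain used for scaling the synthesized signal at the start of the first good frame."""
        voiced_to_unvoiced = (last_good_class in (VOICED_TRANSITION, VOICED, ONSET)
                              and first_good_class == UNVOICED)
        dtx_to_speech = last_good_was_comfort_noise and first_good_is_active_speech
        return g_end if (voiced_to_unvoiced or dtx_to_speech) else g_interpolated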

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (autocorrelation coefficients) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients based on said autocorrelation coefficients (LP filter, LP filter excitation signal) .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (autocorrelation coefficients) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients based on said autocorrelation coefficients (LP filter, LP filter excitation signal) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (autocorrelation coefficients) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5717822A
CLAIM 10
. The method of claim 9 wherein said one or more signal processing operations further comprises : generating eleven autocorrelation coefficient signals based on the stored signal from the LPC synthesis filter ;
and generating tenth order linear prediction coefficients based on said autocorrelation coefficients (LP filter, LP filter excitation signal) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US6006175A

Filed: 1996-02-06     Issued: 1999-12-21

Methods and apparatus for non-acoustic speech characterization and recognition

(Original Assignee) University of California     (Current Assignee) Lawrence Livermore National Security LLC

John F. Holzrichter
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (sampling time) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (end time) affected by the artificial construction of the periodic part .
US6006175A
CLAIM 4
. The method of claim 3 further comprising measuring acoustic pressure or sound intensity over a plurality of sampling time (first impulse response, impulse response) s to obtain amplitude vs . time , frequency , zero crossing times , energy per time interval , and LPC or cepstral coefficients of acoustic speech .

US6006175A
CLAIM 15
. The method of claim 14 further comprising storing , in the feature vector , the start time , duration time , and end time (last subframe) of the defined time frame of each feature vector .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .
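
Claim 2 above has the encoder send the shape, sign and amplitude of the first glottal pulse as phase information. One illustrative way to produce such a triplet is sketched below; the crude two-way "shape" decision, the 5-bit logarithmic amplitude quantizer and all names are invented for illustration and are not the patent's bit allocation.

    import numpy as np

    def encode_first_glottal_pulse(residual, pos):
        """Return (shape index, sign bit, amplitude index) for the pulse at 'pos'."""
        amp = abs(float(residual[pos]))
        sign = 0 if residual[pos] >= 0 else 1
        # Crude shape decision: isolated pulse vs. pulse spread over its neighbours.
        left = abs(float(residual[pos - 1])) if pos > 0 else 0.0
        right = abs(float(residual[pos + 1])) if pos + 1 < len(residual) else 0.0
        shape_idx = 1 if max(left, right) > 0.5 * amp else 0
        # 5-bit quantization of the amplitude on a logarithmic scale (offset assumed).
        amp_idx = int(np.clip(np.round(4.0 * np.log2(amp + 1e-12)) + 16, 0, 31))
        return shape_idx, sign, amp_idx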

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .
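
Claim 3 above determines the first glottal pulse as the maximum-amplitude sample within a pitch period and quantizes its position. A short sketch with a hypothetical uniform position quantizer (the 6-bit resolution and all names are assumptions):

    import numpy as np

    def find_first_glottal_pulse(residual, pitch_period):
        """Position and sign of the maximum-amplitude sample within the first pitch period."""
        segment = residual[:pitch_period]
        pos = int(np.argmax(np.abs(segment)))
        return pos, int(np.sign(segment[pos]))

    def quantize_pulse_position(pos, pitch_period, bits=6):
        """Uniform quantization of the pulse position inside the pitch period."""
        step = max(1, int(np.ceil(pitch_period / (1 << bits))))
        index = pos // step
        return index, index * step      # (transmitted index, decoded position)

    # Example with an LP residual frame and an assumed open-loop pitch of 57 samples.
    residual = np.random.default_rng(0).standard_normal(256)
    pulse_pos, pulse_sign = find_first_glottal_pulse(residual, 57)
    idx, decoded_pos = quantize_pulse_position(pulse_pos, 57)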

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .
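
Claim 4 above computes the energy information parameter from a maximum of the signal energy for voiced and onset frames and from the average energy per sample for the other classes. The sketch below stands in for the maximum with the largest energy over pitch-length windows and reports the result in a log domain; both choices, and all names, are assumptions.

    import numpy as np

    def energy_information_parameter(frame, frame_class, pitch_period):
        """Energy information parameter (illustrative): maximum windowed energy for
        voiced/onset frames, average energy per sample otherwise."""
        if frame_class in ("voiced", "onset"):
            win = max(1, int(pitch_period))
            energies = [float(np.sum(frame[i:i + win] ** 2))
                        for i in range(0, len(frame) - win + 1, win)]
            e = max(energies) if energies else float(np.sum(frame ** 2))
        else:
            e = float(np.sum(frame ** 2)) / len(frame)   # average energy per sample
        return 10.0 * np.log10(e + 1e-12)                # log-domain value (assumed)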

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .
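
Claim 5 above controls the energy of the synthesized signal across the first good frame after an erasure: the signal is first scaled so that its energy at the beginning of that frame matches the energy at the end of the last erased (concealed) frame, and the gain then converges toward the received energy information parameter by the end of the frame, with any energy increase limited. A minimal sketch, assuming per-sample energies, a linear gain ramp and an arbitrary gain cap:

    import numpy as np

    def scale_first_good_frame(synth, E_concealed_end, E_received, max_gain=2.0):
        """Sample-wise gain ramp across the first good frame after an erasure."""
        quarter = max(1, len(synth) // 4)
        E_begin = float(np.sum(synth[:quarter] ** 2)) / quarter + 1e-12
        E_end = float(np.sum(synth[-quarter:] ** 2)) / quarter + 1e-12
        g0 = min(np.sqrt(E_concealed_end / E_begin), max_gain)   # continuity at frame start
        g1 = min(np.sqrt(E_received / E_end), max_gain)          # convergence at frame end
        gains = g0 + (g1 - g0) * np.arange(len(synth)) / max(1, len(synth) - 1)
        return synth * gains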

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (sampling time) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6006175A
CLAIM 4
. The method of claim 3 further comprising measuring acoustic pressure or sound intensity over a plurality of sampling time (first impulse response, impulse response) s to obtain amplitude vs . time , frequency , zero crossing times , energy per time interval , and LPC or cepstral coefficients of acoustic speech .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (sampling time) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 4
. The method of claim 3 further comprising measuring acoustic pressure or sound intensity over a plurality of sampling time (first impulse response, impulse response) s to obtain amplitude vs . time , frequency , zero crossing times , energy per time interval , and LPC or cepstral coefficients of acoustic speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (sampling time) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (end time) affected by the artificial construction of the periodic part .
US6006175A
CLAIM 4
. The method of claim 3 further comprising measuring acoustic pressure or sound intensity over a plurality of sampling time (first impulse response, impulse response) s to obtain amplitude vs . time , frequency , zero crossing times , energy per time interval , and LPC or cepstral coefficients of acoustic speech .

US6006175A
CLAIM 15
. The method of claim 14 further comprising storing , in the feature vector , the start time , duration time , and end time (last subframe) of the defined time frame of each feature vector .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (sampling time) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US6006175A
CLAIM 4
. The method of claim 3 further comprising measuring acoustic pressure or sound intensity over a plurality of sampling time (first impulse response, impulse response) s to obtain amplitude vs . time , frequency , zero crossing times , energy per time interval , and LPC or cepstral coefficients of acoustic speech .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (normal sound) , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (sampling time) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US6006175A
CLAIM 2
. The method of claim 1 wherein the speech is selected from normal sound (signal classification parameter) ed speech , whispered speech , and non-sounded speech .

US6006175A
CLAIM 4
. The method of claim 3 further comprising measuring acoustic pressure or sound intensity over a plurality of sampling time (first impulse response, impulse response) s to obtain amplitude vs . time , frequency , zero crossing times , energy per time interval , and LPC or cepstral coefficients of acoustic speech .

US6006175A
CLAIM 39
. The method of claim 1 further comprising measuring organ contact as one organ touches another and strongly changes the EM wave reflecting condition because of changing resonator or boundary condition (phase information parameter) effects .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH09185397A

Filed: 1995-12-28     Issued: 1997-07-15

Speech information recording apparatus (音声情報記録装置)

(Original Assignee) Olympus Optical Co Ltd; オリンパス光学工業株式会社     

Hideyuki Takahashi (髙橋 秀享)
US7693710B2
CLAIM 1
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (音声情報, する記録) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH09185397A
CLAIM 1
[Claim 1] A speech information (speech signal) recording apparatus comprising : a plurality of speech encoding means having different bit rates ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; speech discrimination means for determining whether an input signal contains speech or silence ; recording (speech signal) means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for , when another speech encoding means is selected by the selection means , continuing encoding with the same speech encoding means until the speech discrimination means determines silence , and switching to the other speech encoding means at the point at which silence is determined .

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (音声情報, する記録) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (線形予測符号化) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
JPH09185397A
CLAIM 1
[Claim 1] A speech information (speech signal) recording apparatus comprising : a plurality of speech encoding means having different bit rates ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; speech discrimination means for determining whether an input signal contains speech or silence ; recording (speech signal) means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for , when another speech encoding means is selected by the selection means , continuing encoding with the same speech encoding means until the speech discrimination means determines silence , and switching to the other speech encoding means at the point at which silence is determined .

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (音声情報, する記録) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (線形予測符号化) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09185397A
CLAIM 1
[Claim 1] A speech information (speech signal) recording apparatus comprising : a plurality of speech encoding means having different bit rates ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; speech discrimination means for determining whether an input signal contains speech or silence ; recording (speech signal) means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for , when another speech encoding means is selected by the selection means , continuing encoding with the same speech encoding means until the speech discrimination means determines silence , and switching to the other speech encoding means at the point at which silence is determined .

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising : a plurality of speech encoding means having different bit rates which perform multipulse-excited linear predictive coding or code-excited linear predictive coding (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from past excitation signals ; selection means for selecting an arbitrary one of the plurality of speech encoding means ; recording means for recording in a memory the encoding selection data obtained from the selection means and the encoded data obtained from the speech encoding means ; and control means for clearing the contents of the synthesis filter and the contents of the adaptive codebook when another speech encoding means is selected by the selection means .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (線形予測符号化) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
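As a numeric illustration of the relation E_q = E_1 × (E_LP0 / E_LP1) quoted above, the following Python sketch computes the impulse-response energies of two LP synthesis filters and the adjusted excitation energy; the helper names and the truncation length are assumptions, not part of the patent:

import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the impulse response of the LP synthesis filter 1/A(z).

    a -- LP coefficients [1, a1, ..., ap] of A(z); n -- truncation length.
    """
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        # recursion for 1/A(z): h[i] = delta[i] - sum_k a[k] * h[i-k]
        acc -= sum(a[k] * h[i - k] for k in range(1, min(i, len(a) - 1) + 1))
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(E1, a_old, a_new):
    """Claim 9 relation: E_q = E_1 * E_LP0 / E_LP1."""
    E_LP0 = lp_impulse_response_energy(a_old)   # last good frame before erasure
    E_LP1 = lp_impulse_response_energy(a_new)   # first good frame after erasure
    return E1 * E_LP0 / E_LP1

# toy example: the new filter has a higher gain, so the excitation is attenuated
a_old = [1.0, -0.5]
a_new = [1.0, -0.8]
print(adjusted_excitation_energy(1.0, a_old, a_new))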
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 10
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 11
. A method of concealing frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
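A hedged sketch of the glottal-pulse search and position quantization described in claims 10 and 11 above; the uniform quantization step is an assumed placeholder, not the codec's actual quantizer:

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, q_step=4):
    """Sketch of claim 11: take the sample of maximum amplitude within the first
    pitch period of the LP residual as the first glottal pulse, and quantize its
    position (here, uniformly with step q_step samples)."""
    segment = residual[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))   # sample of maximum amplitude
    q_index = pos // q_step                 # quantization index to transmit
    q_pos = q_index * q_step                # decoded (quantized) position
    return pos, q_index, q_pos

# example on a synthetic residual with a pulse at sample 37
res = np.zeros(160)
res[37] = 1.0
print(first_glottal_pulse_position(res, pitch_period=80))   # -> (37, 9, 36)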
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (線形予測符号化) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
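The artificial reconstruction of the periodic excitation described in claim 13 above can be pictured with the following Python sketch; the frame length, pitch value and low-pass FIR are assumed illustrative values, not the codec's actual filter:

import numpy as np

def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, lowpass_fir):
    """Sketch of the artificial periodic part of claim 13 (assumed names): center
    the impulse response of a low-pass filter on the quantized first-glottal-pulse
    position, then repeat it every avg_pitch samples up to the end of the region."""
    exc = np.zeros(frame_len)
    half = len(lowpass_fir) // 2
    pos = q_pulse_pos
    while pos < frame_len:
        start = max(0, pos - half)
        stop = min(frame_len, pos - half + len(lowpass_fir))
        exc[start:stop] += lowpass_fir[start - (pos - half):stop - (pos - half)]
        pos += avg_pitch                     # next pulse one average pitch later
    return exc

# example: 5-tap low-pass FIR centered on each pulse position
fir = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
print(build_periodic_excitation(frame_len=160, q_pulse_pos=36, avg_pitch=60, lowpass_fir=fir))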
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (音声情報, する記録) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
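A minimal sketch, under assumed conventions, of the two-branch energy information parameter recited in claims 4, 16 and 24 (maximum of the signal energy for voiced or onset frames, average energy per sample otherwise):

import numpy as np

def energy_information_parameter(frame, frame_class):
    """Sketch of the energy information parameter described above (assumed
    conventions, not the codec's exact windowing): for frames classified as
    voiced or onset, use the maximum of the signal energy; otherwise use the
    average energy per sample."""
    e = frame.astype(float) ** 2
    if frame_class in ("VOICED", "ONSET"):
        return float(np.max(e))       # maximum of the signal energy
    return float(np.mean(e))          # average energy per sample

frame = np.sin(2 * np.pi * np.arange(160) / 40.0)
print(energy_information_parameter(frame, "VOICED"))    # ~1.0
print(energy_information_parameter(frame, "UNVOICED"))  # ~0.5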
JPH09185397A
CLAIM 1
[Claim 1] A speech information (音声情報) (speech signal) recording apparatus comprising: a plurality of speech encoding means of different bit rates; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; speech discrimination means for determining whether an input signal contains speech or silence; recording (記録) (speech signal) means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, continuing encoding with the same speech encoding means until the discrimination result of the speech discrimination means indicates silence, and switching to the other speech encoding means at the point when silence is determined.

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
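One possible reading of the decoder-side energy control recited in claims 5 and 17 is sketched below; the linear gain interpolation and the gain cap are assumptions used only to make the energy-matching and limited-increase behaviour concrete:

import numpy as np

def scale_first_good_frame(synth, e_begin_target, e_end_target, max_gain=2.0):
    """Sketch of the energy control described above (assumed scheme): the frame is
    scaled so that its initial energy matches the energy at the end of the last
    (concealed) frame, and the per-sample gain is interpolated toward the gain
    implied by the received energy parameter, with the gain capped so any energy
    increase stays limited."""
    n = len(synth)
    e0 = np.mean(synth[:n // 4] ** 2) + 1e-12          # energy at frame beginning
    e1 = np.mean(synth[-n // 4:] ** 2) + 1e-12         # energy at frame end
    g0 = min(np.sqrt(e_begin_target / e0), max_gain)   # match end of last erased frame
    g1 = min(np.sqrt(e_end_target / e1), max_gain)     # converge to transmitted energy
    gains = np.linspace(g0, g1, n)                     # sample-by-sample interpolation
    return synth * gains

x = np.random.default_rng(0).standard_normal(160) * 0.1
print(np.round(scale_first_good_frame(x, e_begin_target=0.04, e_end_target=0.01), 3)[:5])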
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (音声情報, する記録) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (線形予測符号化) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
JPH09185397A
CLAIM 1
[Claim 1] A speech information (音声情報) (speech signal) recording apparatus comprising: a plurality of speech encoding means of different bit rates; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; speech discrimination means for determining whether an input signal contains speech or silence; recording (記録) (speech signal) means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, continuing encoding with the same speech encoding means until the discrimination result of the speech discrimination means indicates silence, and switching to the other speech encoding means at the point when silence is determined.

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (音声情報, する記録) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (線形予測符号化) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
JPH09185397A
CLAIM 1
[Claim 1] A speech information (音声情報) (speech signal) recording apparatus comprising: a plurality of speech encoding means of different bit rates; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; speech discrimination means for determining whether an input signal contains speech or silence; recording (記録) (speech signal) means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, continuing encoding with the same speech encoding means until the discrimination result of the speech discrimination means indicates silence, and switching to the other speech encoding means at the point when silence is determined.

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (線形予測符号化) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (線形予測符号化) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (音声情報, する記録) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH09185397A
CLAIM 1
[Claim 1] A speech information (音声情報) (speech signal) recording apparatus comprising: a plurality of speech encoding means of different bit rates; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; speech discrimination means for determining whether an input signal contains speech or silence; recording (記録) (speech signal) means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, continuing encoding with the same speech encoding means until the discrimination result of the speech discrimination means indicates silence, and switching to the other speech encoding means at the point when silence is determined.

JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (線形予測符号化) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
JPH09185397A
CLAIM 3
[Claim 3] A speech information recording apparatus comprising: a plurality of speech encoding means of different bit rates, each performing multipulse-excited linear predictive coding or code-excited linear predictive coding (線形予測符号化) (frame erasure, frame erasure concealment, conducting frame erasure concealment, decoder conducts frame erasure concealment) using an adaptive codebook created from a past excitation signal; selection means for selecting an arbitrary speech encoding means from among the plurality of speech encoding means; recording means for recording, in a memory, encoding selection data obtained from the selection means and encoded data obtained from the speech encoding means; and control means for, when another speech encoding means is selected by the selection means, once clearing the contents of a synthesis filter and the contents of the adaptive codebook.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5819217A

Filed: 1995-12-21     Issued: 1998-10-06

Method and system for differentiating between speech and noise

(Original Assignee) Bell Atlantic Science and Technology Inc     (Current Assignee) Verizon Patent and Licensing Inc

Vijay Rangan Raman
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (average energy) per sample for other frames .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy (average energy) level of the segment , and d) identifying the last frame as noise if the difference value is less than a predetermined amount .
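A hedged Python sketch of the noise test of US5819217A claim 7 quoted above; the threshold value is a hypothetical stand-in for the claim's "predetermined amount":

import numpy as np

def last_frame_is_noise(frames, threshold):
    """Sketch of US5819217A claim 7: the last of at least three adjacent frames is
    flagged as noise when the difference between its energy level and the average
    energy level of the segment is smaller than a predetermined amount."""
    assert len(frames) >= 3
    energies = [float(np.mean(np.asarray(f, dtype=float) ** 2)) for f in frames]
    diff = abs(energies[-1] - np.mean(energies))
    return diff < threshold

segment = [np.random.default_rng(i).standard_normal(80) * 0.01 for i in range(3)]
print(last_frame_is_noise(segment, threshold=1e-4))   # likely True for a steady low-level segment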

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (last frame) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy level of the segment , and d) identifying the last frame (last frame) as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (last frame) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy level of the segment , and d) identifying the last frame (last frame) as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (last frame) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy level of the segment , and d) identifying the last frame (last frame) as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (average energy) per sample for other frames .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy (average energy) level of the segment , and d) identifying the last frame as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (last frame) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy level of the segment , and d) identifying the last frame (last frame) as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (last frame) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy level of the segment , and d) identifying the last frame (last frame) as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (average energy) per sample for other frames .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy (average energy) level of the segment , and d) identifying the last frame as noise if the difference value is less than a predetermined amount .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (last frame) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5819217A
CLAIM 7
. In a signal processing system , a method for identifying background noise in a signal containing speech and noise , comprising the steps of a) separating the signal into frames , b) evaluating energy levels of a segment comprising at least three adjacent frames , c) calculating a difference value between the last of the adjacent frames and the average energy level of the segment , and d) identifying the last frame (last frame) as noise if the difference value is less than a predetermined amount .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5673363A

Filed: 1995-12-20     Issued: 1997-09-30

Error concealment method and apparatus of audio signals

(Original Assignee) Samsung Electronics Co Ltd     (Current Assignee) Samsung Electronics Co Ltd

Byeungwoo Jeon, Jechang Jeong
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .
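A rough sketch in the spirit of US5673363A claim 2 quoted above (the coefficient layout is an assumption, and the weighting procedure of its claim 1 is not reproduced): subbands of an erred frame are filled from the corresponding subbands of the immediately preceding frame.

import numpy as np

def conceal_with_preceding_subbands(prev_coeffs, erred_coeffs, bad_subbands):
    """Assumed layout coeffs[subband][bin]: subbands hit by errors are replaced
    with the corresponding subband coefficients of the immediately preceding frame."""
    out = [np.array(sb, dtype=float, copy=True) for sb in erred_coeffs]
    for sb in bad_subbands:
        out[sb] = np.array(prev_coeffs[sb], dtype=float, copy=True)
    return out

prev = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
curr = [np.zeros(4), np.zeros(4), np.zeros(4)]
print(conceal_with_preceding_subbands(prev, curr, bad_subbands=[1]))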

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5673363A
CLAIM 10
. The method according to claim 1 , wherein said step (e) replaces said predetermined weight values with a value of zero to provide audio muting , when the number of the succeeding frames where errors have occurred is larger than or equal to a predetermined number (last non) .
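A minimal sketch of the muting rule of US5673363A claim 10 quoted above; mute_after stands in for the claim's "predetermined number" and is a hypothetical value:

def concealment_weights(base_weights, consecutive_errors, mute_after=3):
    """Once the run of erred frames reaches mute_after, the prediction weights are
    forced to zero so the output is muted instead of being extrapolated further."""
    if consecutive_errors >= mute_after:
        return [0.0] * len(base_weights)
    return list(base_weights)

print(concealment_weights([0.8, 0.2], consecutive_errors=1))   # -> [0.8, 0.2]
print(concealment_weights([0.8, 0.2], consecutive_errors=4))   # -> [0.0, 0.0]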

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5673363A
CLAIM 10
. The method according to claim 1 , wherein said step (e) replaces said predetermined weight values with a value of zero to provide audio muting , when the number of the succeeding frames where errors have occurred is larger than or equal to a predetermined number (last non) .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .
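Claims 10 and 11 as charted above define the phase information as the position of the first glottal pulse: the sample of maximum amplitude within a pitch period is taken as the pulse, its position is quantized, and (claim 10) its shape, sign and amplitude are encoded and transmitted. A short Python sketch of the position search and a uniform position quantizer follows; the residual input, bit budget and names are assumptions.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period):
    """Index of the maximum-amplitude sample within the first pitch period of
    the frame's LP residual (illustrative reading of claims 10 and 11)."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    return int(np.argmax(np.abs(segment)))

def quantize_pulse_position(position, pitch_period, bits=6):
    """Uniform quantization of the pulse position within the pitch period,
    using an assumed bit budget; returns (index, reconstructed position)."""
    levels = 1 << bits
    step = pitch_period / levels
    index = min(levels - 1, int(position / step))
    return index, int(round((index + 0.5) * step))

The sign and amplitude of the pulse would then be read directly at the found position, and the shape matched against a small pulse codebook; neither step is shown here.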

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US5673363A
CLAIM 10
. The method according to claim 1 , wherein said step (e) replaces said predetermined weight values with a value of zero to provide audio muting , when the number of the succeeding frames where errors have occurred is larger than or equal to a predetermined number (last non) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .
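Claim 16 (like method claim 4 and device claim 24) computes the energy information parameter in two ways depending on the frame class: relative to the maximum of the signal energy for frames classified as voiced or onset, and relative to the average energy per sample for all other frames. A compact Python sketch, with the class labels, windowing and names as assumptions:

import numpy as np

def energy_information(frame, frame_class, pitch_period=None):
    """Illustrative energy-information parameter per claim 16 as charted:
    maximum signal energy for voiced/onset frames, average energy per sample
    otherwise. The sliding-window measurement is an assumption."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        win = int(pitch_period) if pitch_period else len(x)
        win = max(1, min(win, len(x)))
        hop = max(1, win // 4)
        energies = [float(np.sum(x[i:i + win] ** 2))
                    for i in range(0, len(x) - win + 1, hop)]
        return max(energies)
    return float(np.mean(x ** 2))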

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .
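Claim 17 (and method claim 5) describes two-sided energy control in the first good frame after an erasure: the synthesis is first scaled so that its energy at the frame start matches the energy at the end of the last concealed frame, and the gain then converges toward the received energy information by the end of the frame while any increase in energy is limited. The sketch below applies a per-sample linear gain ramp with an assumed cap; the names, measurement windows and ramp shape are ours, not the patent's.

import numpy as np

def rescale_first_good_frame(synth, energy_end_concealed, target_energy, max_gain_increase=2.0):
    """Illustrative energy control per claims 5/17 as charted (assumed details)."""
    x = np.asarray(synth, dtype=float)
    quarter = max(1, len(x) // 4)                        # assumed measurement window
    start_energy = float(np.sum(x[:quarter] ** 2)) or 1.0
    g0 = np.sqrt(energy_end_concealed / start_energy)    # match the last concealed frame's end energy
    end_energy = float(np.sum(x[-quarter:] ** 2)) or 1.0
    g1 = np.sqrt(target_energy / end_energy)             # converge to the received energy information
    g1 = min(g1, g0 * max_gain_increase)                 # limit the increase in energy
    gains = np.linspace(g0, g1, num=len(x))              # sample-by-sample gain interpolation
    return x * gains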

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5673363A
CLAIM 10
. The method according to claim 1 , wherein said step (e) replaces said predetermined weight values with a value of zero to provide audio muting , when the number of the succeeding frames where errors have occurred is larger than or equal to a predetermined number (last non) .
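Claim 19 (and method claim 7) makes the gain used at the start of the first good frame after an erasure equal to the gain used at its end in two situations, so the concealment does not artificially amplify the new content: a voiced-to-unvoiced transition (last good frame before the erasure classified as voiced transition, voiced or onset, first good frame after it classified as unvoiced), and a comfort-noise-to-active-speech transition. A small decision helper, with the class and frame-type labels as assumptions:

VOICED_LIKE = {"voiced", "voiced transition", "onset"}

def use_end_gain_at_start(last_good_class, first_good_class,
                          last_good_is_comfort_noise, first_good_is_active_speech):
    """Return True when the start-of-frame scaling gain should equal the
    end-of-frame gain (illustrative reading of claims 7 and 19)."""
    voiced_to_unvoiced = (last_good_class in VOICED_LIKE
                          and first_good_class == "unvoiced")
    dtx_to_active = last_good_is_comfort_noise and first_good_is_active_speech
    return voiced_to_unvoiced or dtx_to_active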

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5673363A
CLAIM 10
. The method according to claim 1 , wherein said step (e) replaces said predetermined weight values with a value of zero to provide audio muting , when the number of the succeeding frames where errors have occurred is larger than or equal to a predetermined number (last non) .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5673363A
CLAIM 2
. The method according to claim 1 , wherein said step (e) uses frequency coefficients subbands which exist in the immediate preceding frame (signal classification parameter) of the error frame(s) and is adjacent to the error frame(s) .

US5673363A
CLAIM 10
. The method according to claim 1 , wherein said step (e) replaces said predetermined weight values with a value of zero to provide audio muting , when the number of the succeeding frames where errors have occurred is larger than or equal to a predetermined number (last non) .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5745871A

Filed: 1995-11-29     Issued: 1998-04-28

Pitch period estimation for use with audio coders

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Juin-Hwey Chen
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .
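Reading note for the US5745871A rows that follow: claim 1 can be read as a mode-switched pitch quantizer in which frames other than voiced receive a reserved non-zero bias value, while voiced frames quantize the difference between the measured pitch period and a prediction formed from the pitch of at least one preceding frame. The sketch below is a compact illustration of that reading only; the codebook size, index layout and the choice of the previous frame's pitch as predictor are assumptions.

def quantize_pitch_period(pitch, is_voiced, prev_pitch, n_values=128):
    """Illustrative quantizer in the spirit of US5745871A claim 1.

    Index 0 is an assumed reserved slot that the decoder maps to a fixed
    non-zero bias pitch value; the remaining indices carry the clamped
    difference between the measured pitch and the preceding frame's pitch."""
    if not is_voiced:
        return 0                                   # reserved slot -> bias value at the decoder
    difference = pitch - prev_pitch                # predictive (differential) coding for voiced frames
    half = (n_values - 2) // 2
    d = max(-half, min(half, int(round(difference))))
    return 1 + half + d                            # indices 1 .. n_values - 1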

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .
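For illustration only, a sketch of the two transition conditions in the '710 claim 19 element above under which the gain at the beginning of the first non erased frame is set equal to the gain at its end (function name, parameter names and class labels are assumptions):

def equalize_gains(g_begin, g_end, last_class, first_class,
                   last_coded_as_comfort_noise, first_coded_as_active):
    """Return (gain at frame beginning, gain at frame end). In the two transition
    cases recited above, the beginning gain is made equal to the end gain."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    # Transition from a voiced frame to an unvoiced frame around the erasure.
    if last_class in voiced_like and first_class == "unvoiced":
        return g_end, g_end
    # Transition from a non-active speech period (comfort noise) to active speech.
    if last_coded_as_comfort_noise and first_coded_as_active:
        return g_end, g_end
    return g_begin, g_end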

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .
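For illustration only, a sketch of the phase-information element in the '710 claim 22 text above, in which the encoder locates the first glottal pulse and encodes its sign, amplitude and shape for transmission to the decoder; the shape codebook, the window length and the correlation-based shape selection are assumptions:

import numpy as np

def encode_first_glottal_pulse(residual, pos, shape_codebook):
    """Encode the sign, amplitude and shape of the first glottal pulse located at
    sample index `pos` of the LP residual (all illustrative assumptions)."""
    residual = np.asarray(residual, dtype=float)
    amp = residual[pos]
    sign = 1 if amp >= 0 else -1
    amplitude = abs(float(amp))
    # Compare the waveform around the pulse with each candidate shape.
    L = len(shape_codebook[0])
    half = L // 2
    start = max(0, pos - half)
    segment = residual[start:start + L]
    segment = np.pad(segment, (0, L - len(segment)))  # zero-pad near frame edges
    shape_index = int(np.argmax([sign * float(np.dot(segment, s))
                                 for s in shape_codebook]))
    return sign, amplitude, shape_index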

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .
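For illustration only, a sketch of the '710 claim 23 element above: the sample of maximum amplitude within a pitch period is taken as the first glottal pulse and its position within the pitch period is quantized (the uniform quantization step is an assumption):

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, quant_step=4):
    """Take the sample of maximum amplitude within the first pitch period as the
    first glottal pulse and quantize its position."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    position = int(np.argmax(np.abs(segment)))
    quantized_index = position // quant_step           # index to be transmitted
    quantized_position = quantized_index * quant_step  # position used for concealment
    return position, quantized_index, quantized_position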

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signal (speech signal, decoder determines concealment) s to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .
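For illustration only, a sketch of the energy information parameter computation in the '710 claim 24 element above: a maximum of the signal energy for frames classified as voiced or onset, and an average energy per sample for other frames (function and parameter names are assumptions):

import numpy as np

def energy_info_parameter(frame, frame_class):
    """Energy information parameter: maximum of the per-sample signal energy for
    frames classified as voiced or onset, average energy per sample otherwise."""
    frame = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))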

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5745871A
CLAIM 1
. A method of quantizing a pitch period signal relating to the pitch period for one of a sequence of frames of sampled input speech signals to one of a plurality , N , of quantizing values comprising the steps of determining whether said one frame of input speech signals corresponds to voiced speech or to other than voiced speech , when said one frame of input speech signals corresponds to other than voiced speech assigning a particular non-zero one of said N quantizing values to said pitch period signal , said non-zero quantizing value comprising a bias value for said pitch period , when said one frame of input speech signals corresponds to a voiced speech signal , extracting from said one frame of input speech signals a first signal representative of the pitch period for said one frame of input speech signals , generating a prediction signal corresponding to a prediction of the pitch period for said one frame based on the value of the pitch period signal for at least one preceding frame (signal classification parameter) of sampled input speech signals , comparing the value of said first signal with the value of said prediction signal to form a difference signal , and assigning a value , other than the bias value , to said pitch period signal for said one frame based on said difference signal .
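For illustration only, a sketch of the energy adjustment relation quoted in the '710 claim 25 element above, E_q = E_1 · (E_LP0 / E_LP1), with the impulse-response energies of the LP synthesis filters computed by direct recursion; the truncation length and the coefficient convention A(z) = 1 + a[1]·z^-1 + ... + a[p]·z^-p are assumptions:

import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the (truncated) impulse response of the LP synthesis filter
    1 / A(z), with A(z) = 1 + a[1]·z^-1 + ... + a[p]·z^-p."""
    a = np.asarray(a, dtype=float)
    p = len(a) - 1
    h = np.zeros(length)
    h[0] = 1.0
    for n in range(1, length):
        acc = 0.0
        for k in range(1, min(n, p) + 1):
            acc += a[k] * h[n - k]
        h[n] = -acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 · (E_LP0 / E_LP1), with E_LP0 and E_LP1 the impulse-response
    energies of the LP filters of the last good frame before the erasure and of
    the first good frame after it."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1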




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5799276A

Filed: 1995-11-07     Issued: 1998-08-25

Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals

(Original Assignee) Accent Inc     (Current Assignee) Rosetta Stone Ltd

Edward Komissarchik, Vladimir Arlazarov, Dimitri Bogdanov, Yuri Finkelstein, Andrey Ivanov, Jacob Kaminsky, Julia Komissarchik, Olga Krivnova, Mikhail Kronrod, Mikhail Malkovsky, Maxim Paklin, Alexander Rozanov, Vladimir Segal, Nina Zinovieva
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (pitch period) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5799276A
CLAIM 1
. A knowledge-based speech recognition system for recognizing an input speech signal comprising vocalic and non-vocalic intervals , each of the vocalic intervals having a pitch period (decoder concealment, pitch period, decoder determines concealment) , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal ;
means for characterizing the series of segments based upon acoustic events detected within the input speech signal to obtain an acoustic feature vector ;
means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a phonetic transcription and at least one acoustic event transcription ;
and means for selecting a word choice by comparing the acoustic feature vector to the acoustic event transcriptions of the multiplicity of words .
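For illustration only, a sketch of the artificial periodic excitation described in the '710 claim 1 element charted above: a low-pass filtered train of pulses built by centering a first low-pass impulse response on the quantized position of the first glottal pulse and placing the remaining responses one average pitch value apart; the windowed-sinc low-pass response and the frame-length stop condition are assumptions:

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps=None):
    """Artificial periodic excitation as a low-pass filtered train of pulses: the
    first low-pass impulse response is centered on the quantized position of the
    first glottal pulse, the remaining responses are placed one average pitch
    value apart."""
    if lp_taps is None:
        n = np.arange(-8, 9)
        lp_taps = np.sinc(0.5 * n) * np.hamming(len(n))  # hypothetical low-pass response
    half = len(lp_taps) // 2
    excitation = np.zeros(frame_len)
    pos = float(first_pulse_pos)
    while pos < frame_len:
        center = int(round(pos))
        for i, tap in enumerate(lp_taps):
            idx = center - half + i
            if 0 <= idx < frame_len:
                excitation[idx] += tap
        pos += avg_pitch  # next pulse one average pitch value later
    return excitation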

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5799276A
CLAIM 1
. A knowledge-based speech recognition system for recognizing an input speech signal comprising vocalic and non-vocalic intervals , each of the vocalic intervals having a pitch period (decoder concealment, pitch period, decoder determines concealment) , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal ;
means for characterizing the series of segments based upon acoustic events detected within the input speech signal to obtain an acoustic feature vector ;
means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a phonetic transcription and at least one acoustic event transcription ;
and means for selecting a word choice by comparing the acoustic feature vector to the acoustic event transcriptions of the multiplicity of words .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (acoustic characteristics) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (acoustic characteristics) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (acoustic characteristics) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5799276A
CLAIM 1
. A knowledge-based speech recognition system for recognizing an input speech signal comprising vocalic and non-vocalic intervals , each of the vocalic intervals having a pitch period (decoder concealment, pitch period, decoder determines concealment) , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal ;
means for characterizing the series of segments based upon acoustic events detected within the input speech signal to obtain an acoustic feature vector ;
means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a phonetic transcription and at least one acoustic event transcription ;
and means for selecting a word choice by comparing the acoustic feature vector to the acoustic event transcriptions of the multiplicity of words .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (pitch period) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5799276A
CLAIM 1
. A knowledge-based speech recognition system for recognizing an input speech signal comprising vocalic and non-vocalic intervals , each of the vocalic intervals having a pitch period (decoder concealment, pitch period, decoder determines concealment) , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal ;
means for characterizing the series of segments based upon acoustic events detected within the input speech signal to obtain an acoustic feature vector ;
means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a phonetic transcription and at least one acoustic event transcription ;
and means for selecting a word choice by comparing the acoustic feature vector to the acoustic event transcriptions of the multiplicity of words .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5799276A
CLAIM 1
. A knowledge-based speech recognition system for recognizing an input speech signal comprising vocalic and non-vocalic intervals , each of the vocalic intervals having a pitch period (decoder concealment, pitch period, decoder determines concealment) , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal ;
means for characterizing the series of segments based upon acoustic events detected within the input speech signal to obtain an acoustic feature vector ;
means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a phonetic transcription and at least one acoustic event transcription ;
and means for selecting a word choice by comparing the acoustic feature vector to the acoustic event transcriptions of the multiplicity of words .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (acoustic characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (acoustic characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (acoustic characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (pitch period) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5799276A
CLAIM 1
. A knowledge-based speech recognition system for recognizing an input speech signal comprising vocalic and non-vocalic intervals , each of the vocalic intervals having a pitch period (decoder concealment, pitch period, decoder determines concealment) , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal ;
means for characterizing the series of segments based upon acoustic events detected within the input speech signal to obtain an acoustic feature vector ;
means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a phonetic transcription and at least one acoustic event transcription ;
and means for selecting a word choice by comparing the acoustic feature vector to the acoustic event transcriptions of the multiplicity of words .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (acoustic characteristics) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5799276A
CLAIM 24
. A knowledge-based speech recognition system for recognizing an input speech signal , the input speech signal including a sequence of phonemes , the system comprising : means for capturing the input speech signal ;
means for segmenting the input speech signal into a series of segments including vocalic intervals and non-vocalic intervals , the vocalic intervals having a frame length computed based on an estimation of the pitch period of the vocalic intervals of the input speech signal , the series of segments approximately corresponding to the sequence of phonemes ;
means for characterizing the acoustic and spectral characteristics of the input speech signal : means for storing a dictionary having a multiplicity of words , each one of the multiplicity of words described by a series of acoustic events , spectral characteristics and a series of grammatical , semantic and syntactic attributes ;
and means for selecting a word choice from amongst the multiplicity of words by weighing , for one or more of the multiplicity of words , correspondence between the acoustic events and the acoustic characteristics (speech signal) , between the spectral characteristics of the words and the input speech signal , and the grammatical , semantic and syntactic attributes of the words .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5596676A

Filed: 1995-10-11     Issued: 1997-01-21

Mode-specific method and apparatus for encoding signals containing speech

(Original Assignee) Hughes Electronics Corp     (Current Assignee) JPMorgan Chase Bank NA ; Hughes Network Systems LLC

Kumar Swaminathan, Kalyan Ganesan, Prabhat K. Gupta
US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (second line) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5596676A
CLAIM 1
. A method of encoding a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : analyzing a first linear prediction window to generate a first set of filter coefficients for a frame ;
analyzing a second line (last non) ar prediction window to generate a second set of filter coefficients for the frame ;
analyzing a first pitch analysis window to generate a first pitch estimate for the frame ;
analyzing a second pitch analysis window to generate a second pitch estimate for the frame ;
determining whether the frame is one of a first mode , a second mode and a third mode , depending on measures of energy content of the frame and spectral content of the frame ;
encoding the frame , depending on the second set of filter coefficients and the first and the second pitch estimates , independently of the first set of filter coefficients , when the frame is determined to be the third mode ;
encoding the frame , depending on the first and the second sets of filter coefficients , independently of the first and the second pitch estimates , when the frame is determined to be the second mode ;
and encoding the frame , depending on the second set of filter coefficients , independently of the first set of filter coefficients and the first and the second pitch estimates , when the frame is determined to be the first mode .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (second line) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5596676A
CLAIM 1
. A method of encoding a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : analyzing a first linear prediction window to generate a first set of filter coefficients for a frame ;
analyzing a second line (last non) ar prediction window to generate a second set of filter coefficients for the frame ;
analyzing a first pitch analysis window to generate a first pitch estimate for the frame ;
analyzing a second pitch analysis window to generate a second pitch estimate for the frame ;
determining whether the frame is one of a first mode , a second mode and a third mode , depending on measures of energy content of the frame and spectral content of the frame ;
encoding the frame , depending on the second set of filter coefficients and the first and the second pitch estimates , independently of the first set of filter coefficients , when the frame is determined to be the third mode ;
encoding the frame , depending on the first and the second sets of filter coefficients , independently of the first and the second pitch estimates , when the frame is determined to be the second mode ;
and encoding the frame , depending on the second set of filter coefficients , independently of the first set of filter coefficients and the first and the second pitch estimates , when the frame is determined to be the first mode .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (second line) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5596676A
CLAIM 1
. A method of encoding a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : analyzing a first linear prediction window to generate a first set of filter coefficients for a frame ;
analyzing a second line (last non) ar prediction window to generate a second set of filter coefficients for the frame ;
analyzing a first pitch analysis window to generate a first pitch estimate for the frame ;
analyzing a second pitch analysis window to generate a second pitch estimate for the frame ;
determining whether the frame is one of a first mode , a second mode and a third mode , depending on measures of energy content of the frame and spectral content of the frame ;
encoding the frame , depending on the second set of filter coefficients and the first and the second pitch estimates , independently of the first set of filter coefficients , when the frame is determined to be the third mode ;
encoding the frame , depending on the first and the second sets of filter coefficients , independently of the first and the second pitch estimates , when the frame is determined to be the second mode ;
and encoding the frame , depending on the second set of filter coefficients , independently of the first set of filter coefficients and the first and the second pitch estimates , when the frame is determined to be the first mode .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (second line) erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5596676A
CLAIM 1
. A method of encoding a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : analyzing a first linear prediction window to generate a first set of filter coefficients for a frame ;
analyzing a second line (last non) ar prediction window to generate a second set of filter coefficients for the frame ;
analyzing a first pitch analysis window to generate a first pitch estimate for the frame ;
analyzing a second pitch analysis window to generate a second pitch estimate for the frame ;
determining whether the frame is one of a first mode , a second mode and a third mode , depending on measures of energy content of the frame and spectral content of the frame ;
encoding the frame , depending on the second set of filter coefficients and the first and the second pitch estimates , independently of the first set of filter coefficients , when the frame is determined to be the third mode ;
encoding the frame , depending on the first and the second sets of filter coefficients , independently of the first and the second pitch estimates , when the frame is determined to be the second mode ;
and encoding the frame , depending on the second set of filter coefficients , independently of the first set of filter coefficients and the first and the second pitch estimates , when the frame is determined to be the first mode .
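
A minimal decision sketch of the gain rule in claim 19 above: at a voiced-to-unvoiced transition, or when the last good frame was comfort noise and the first good frame is active speech, the gain applied at the beginning of the first good frame is simply set equal to the gain used at its end. The class labels and the None fallback are illustrative assumptions.

```python
def start_gain_after_erasure(g_end, last_good_class, first_good_class,
                             last_good_was_cng, first_good_is_active):
    # Claim 19 (sketched): reuse the end-of-frame gain at the frame start for a
    # voiced->unvoiced transition or a comfort-noise->active-speech transition.
    voiced_like = {"voiced transition", "voiced", "onset"}
    if (last_good_class in voiced_like and first_good_class == "unvoiced") or \
       (last_good_was_cng and first_good_is_active):
        return g_end          # start-of-frame gain = end-of-frame gain
    return None               # otherwise fall back to the normal gain derivation
```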

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 × ( E LP0 / E LP1 ) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non (second line) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5596676A
CLAIM 1
. A method of encoding a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : analyzing a first linear prediction window to generate a first set of filter coefficients for a frame ;
analyzing a second linear (last non) prediction window to generate a second set of filter coefficients for the frame ;
analyzing a first pitch analysis window to generate a first pitch estimate for the frame ;
analyzing a second pitch analysis window to generate a second pitch estimate for the frame ;
determining whether the frame is one of a first mode , a second mode and a third mode , depending on measures of energy content of the frame and spectral content of the frame ;
encoding the frame , depending on the second set of filter coefficients and the first and the second pitch estimates , independently of the first set of filter coefficients , when the frame is determined to be the third mode ;
encoding the frame , depending on the first and the second sets of filter coefficients , independently of the first and the second pitch estimates , when the frame is determined to be the second mode ;
and encoding the frame , depending on the second set of filter coefficients , independently of the first set of filter coefficients and the first and the second pitch estimates , when the frame is determined to be the first mode .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 × ( E LP0 / E LP1 ) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non (second line) erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5596676A
CLAIM 1
. A method of encoding a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : analyzing a first linear prediction window to generate a first set of filter coefficients for a frame ;
analyzing a second linear (last non) prediction window to generate a second set of filter coefficients for the frame ;
analyzing a first pitch analysis window to generate a first pitch estimate for the frame ;
analyzing a second pitch analysis window to generate a second pitch estimate for the frame ;
determining whether the frame is one of a first mode , a second mode and a third mode , depending on measures of energy content of the frame and spectral content of the frame ;
encoding the frame , depending on the second set of filter coefficients and the first and the second pitch estimates , independently of the first set of filter coefficients , when the frame is determined to be the third mode ;
encoding the frame , depending on the first and the second sets of filter coefficients , independently of the first and the second pitch estimates , when the frame is determined to be the second mode ;
and encoding the frame , depending on the second set of filter coefficients , independently of the first set of filter coefficients and the first and the second pitch estimates , when the frame is determined to be the first mode .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5835495A

Filed: 1995-10-11     Issued: 1998-11-10

System and method for scaleable streamed audio transmission over a network

(Original Assignee) Microsoft Corp     (Current Assignee) Microsoft Technology Licensing LLC

Philippe Ferriere
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (audio frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5835495A
CLAIM 19
. An audio file distribution system comprising : a client computing unit having a modem operating to receive digital data at an effective bit rate ;
an audio file database to store a plurality of audio files ;
and an audio server to retrieve multiple audio files from the audio file database and encode the audio files as individual audio data blocks which contain a certain number bits of digital audio data that have been sampled at a selected sampling rate wherein the number of bits of digital data and the sampling rate are selected to provide an encoded bit stream bit rate that is less than or equal to the effective bit rate of the client's modem ;
and the client computing unit is configured to decode the audio data blocks into audio frames (onset frame) , mix the audio frames , and reproduce sound from the audio frames .
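
A runnable sketch of the artificial periodic excitation described in claim 1 above: the impulse response of a low-pass filter is centred on the quantized first-glottal-pulse position and repeated every average-pitch samples up to the end of the reconstructed region. The centring convention and the filter length are assumptions for illustration.

```python
import numpy as np

def build_periodic_excitation(region_len, first_pulse_pos, avg_pitch, lowpass_ir):
    # Low-pass filtered periodic pulse train (claim 1, sketched): centre the first
    # low-pass impulse response at the quantized glottal-pulse position, then place
    # further copies avg_pitch samples apart until the end of the region.
    exc = np.zeros(region_len)
    half = len(lowpass_ir) // 2
    pos = float(first_pulse_pos)
    while pos < region_len:
        start = int(round(pos)) - half
        for i, h in enumerate(lowpass_ir):
            idx = start + i
            if 0 <= idx < region_len:
                exc[idx] += h
        pos += avg_pitch
    return exc
```

For example, build_periodic_excitation(256, first_pulse_pos=17, avg_pitch=60.5, lowpass_ir=np.hamming(11)) would place five overlapping-free pulse copies across a 256-sample region in this sketch.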

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .
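
Claim 2 above transmits a shape, sign and amplitude for the first glottal pulse. The sketch below illustrates one plausible parameterisation (a tiny shape codebook, a sign bit and a scalar-quantized amplitude); the codebook, quantizer step and error measure are purely illustrative assumptions.

```python
import numpy as np

def encode_first_glottal_pulse(pulse, shape_codebook, amp_step=0.25):
    # Illustrative encoding of the shape, sign and amplitude of the first glottal pulse.
    pulse = np.asarray(pulse, dtype=float)
    peak = int(np.argmax(np.abs(pulse)))
    sign = 1 if pulse[peak] >= 0 else -1
    amp = float(np.abs(pulse[peak]))
    amp_index = int(round(amp / amp_step))            # scalar-quantized amplitude
    shape = np.abs(pulse) / (amp + 1e-12)             # normalized magnitude shape
    # Pick the closest codebook entry in a mean-squared-error sense.
    errs = [np.sum((shape - np.asarray(c)) ** 2) for c in shape_codebook]
    shape_index = int(np.argmin(errs))
    return shape_index, sign, amp_index
```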

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .
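
A sketch of the position determination in claim 3 above: the first glottal pulse is taken as the sample of maximum amplitude inside the first pitch period of the LP residual, and its position is quantized. The uniform quantization step of 4 samples is an assumption, not the codec's actual resolution.

```python
import numpy as np

def quantize_first_glottal_pulse_position(residual, pitch_period, step=4):
    # Claim 3 (sketched): the maximum-amplitude sample within one pitch period is
    # treated as the first glottal pulse; its index is quantized with a uniform step.
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))
    code = pos // step                     # transmitted position codeword
    return pos, code, code * step          # raw position, codeword, decoded position
```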

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .
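
Claim 4 above computes the energy information parameter differently per frame class. Below is a hedged sketch: a maximum of windowed signal energy for voiced or onset frames and a mean energy per sample otherwise; the 32-sample window is an assumption, not the codec's actual analysis length.

```python
import numpy as np

def energy_information_parameter(frame, frame_class, win=32):
    # Claim 4 (sketched): maximum short-term energy for voiced/onset frames,
    # average energy per sample for all other classes.
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energies = [np.sum(x[i:i + win] ** 2)
                    for i in range(0, len(x) - win + 1, win)] or [np.sum(x ** 2)]
        return float(max(energies))
    return float(np.mean(x ** 2))
```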

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (digital audio samples) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US5835495A
CLAIM 20
. A communication system comprising : first and second communication units , each communication unit being equipped with a modem operating to receive and transmit data at an effective bit rate ;
a network interconnecting the first and second communication units ;
said first communication unit being configured to supply the effective bit rate of the modem for the first communication unit to the second communication unit ;
said second communication unit being configured to determine a smallest effective bit rate from between the effective bit rates for the modems of the first and second communication units and to send the smallest effective bit rate back to the first communication unit ;
said first and second communication units being configured to generate digital audio samples (controlling energy) representative of an audio signal at a selected input sampling rate , the first and second communication units being equipped with an audio coder/decoder having multiple quantizers that encode the digital audio samples into various sized audio data blocks which contain various quantities of bits of the audio samples , said first and second communication units using an appropriate input sampling rate and selecting an appropriate quantizer to encode the audio samples into audio data blocks that yield an encoded bit stream bit rate less than or equal to the smallest effective bit rate of the modems ;
and said first and second communication units exchanging the audio data blocks over the network .
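
A sketch of the two-sided energy control in claim 5 above: the synthesized first good frame is scaled so that its start energy matches the end of the concealed signal, and its gain is then interpolated toward the received energy target while the upward gain is capped. The quarter-frame energy windows, the linear gain interpolation and the cap value are assumptions made for this sketch only.

```python
import numpy as np

def scale_first_good_frame(synth, e_start_target, e_end_target, max_gain=1.98):
    # Claim 5 (sketched): match the start-of-frame energy to the concealed signal,
    # converge toward the received energy parameter by the frame end, and limit
    # the allowed increase in energy via a gain cap.
    x = np.asarray(synth, dtype=float)
    n = len(x)
    q = max(n // 4, 1)
    e_start = float(np.sum(x[:q] ** 2)) + 1e-12
    e_end = float(np.sum(x[-q:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_start_target / e_start), max_gain)
    g1 = min(np.sqrt(e_end_target / e_end), max_gain)   # limit the energy increase
    return x * np.linspace(g0, g1, n)
```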

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5835495A
CLAIM 3
. A method for supplying digital audio files to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the audio files being configured into individual audio data blocks wherein each audio data block contains a certain number of bits of digital audio data sampled at an input sampling rate , the method comprising the following steps : storing multiple versions of an audio file , each version of the audio file being configured in audio data blocks of different block sizes and produced using different input sampling rates , wherein the block size represents a (LP filter excitation signal) number of data bits contained within an individual audio data block ;
choosing an appropriate version of the audio file that has a block size and an input sampling rate which produces a bit stream bit rate that is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks at the bit stream bit rate to the recipient's modem .

US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 × ( E LP0 / E LP1 ) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5835495A
CLAIM 3
. A method for supplying digital audio files to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the audio files being configured into individual audio data blocks wherein each audio data block contains a certain number of bits of digital audio data sampled at an input sampling rate , the method comprising the following steps : storing multiple versions of an audio file , each version of the audio file being configured in audio data blocks of different block sizes and produced using different input sampling rates , wherein the block size represents a (LP filter excitation signal) number of data bits contained within an individual audio data block ;
choosing an appropriate version of the audio file that has a block size and an input sampling rate which produces a bit stream bit rate that is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 × ( E LP0 / E LP1 ) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5835495A
CLAIM 3
. A method for supplying digital audio files to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the audio files being configured into individual audio data blocks wherein each audio data block contains a certain number of bits of digital audio data sampled at an input sampling rate , the method comprising the following steps : storing multiple versions of an audio file , each version of the audio file being configured in audio data blocks of different block sizes and produced using different input sampling rates , wherein the block size represents a (LP filter excitation signal) number of data bits contained within an individual audio data block ;
choosing an appropriate version of the audio file that has a block size and an input sampling rate which produces a bit stream bit rate that is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks at the bit stream bit rate to the recipient's modem .

US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (audio frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5835495A
CLAIM 19
. An audio file distribution system comprising : a client computing unit having a modem operating to receive digital data at an effective bit rate ;
an audio file database to store a plurality of audio files ;
and an audio server to retrieve multiple audio files from the audio file database and encode the audio files as individual audio data blocks which contain a certain number bits of digital audio data that have been sampled at a selected sampling rate wherein the number of bits of digital data and the sampling rate are selected to provide an encoded bit stream bit rate that is less than or equal to the effective bit rate of the client's modem ;
and the client computing unit is configured to decode the audio data blocks into audio frames (onset frame) , mix the audio frames , and reproduce sound from the audio frames .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5835495A
CLAIM 3
. A method for supplying digital audio files to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the audio files being configured into individual audio data blocks wherein each audio data block contains a certain number of bits of digital audio data sampled at an input sampling rate , the method comprising the following steps : storing multiple versions of an audio file , each version of the audio file being configured in audio data blocks of different block sizes and produced using different input sampling rates , wherein the block size represents a (LP filter excitation signal) number of data bits contained within an individual audio data block ;
choosing an appropriate version of the audio file that has a block size and an input sampling rate which produces a bit stream bit rate that is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks at the bit stream bit rate to the recipient's modem .

US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 × ( E LP0 / E LP1 ) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5835495A
CLAIM 3
. A method for supplying digital audio files to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the audio files being configured into individual audio data blocks wherein each audio data block contains a certain number of bits of digital audio data sampled at an input sampling rate , the method comprising the following steps : storing multiple versions of an audio file , each version of the audio file being configured in audio data blocks of different block sizes and produced using different input sampling rates , wherein the block size represents a (LP filter excitation signal) number of data bits contained within an individual audio data block ;
choosing an appropriate version of the audio file that has a block size and an input sampling rate which produces a bit stream bit rate that is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number of bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

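As a rough illustration of the claim 24 energy element above (a maximum of the signal energy for voiced or onset frames, an average energy per sample otherwise), a small Python sketch follows; the class labels and the use of squared frame samples are assumptions for illustration only.

```python
import numpy as np

VOICED_LIKE = {"voiced", "onset"}

def energy_information(frame, frame_class):
    """Energy information parameter: maximum of the signal energy for frames
    classified as voiced or onset, average energy per sample otherwise."""
    samples = np.asarray(frame, dtype=float)
    if frame_class in VOICED_LIKE:
        return float(np.max(samples ** 2))   # maximum of the signal energy
    return float(np.mean(samples ** 2))      # average energy per sample
```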
US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (selecting step) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5835495A
CLAIM 3
. A method for supplying digital audio files to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the audio files being configured into individual audio data blocks wherein each audio data block contains a certain number of bits of digital audio data sampled at an input sampling rate , the method comprising the following steps : storing multiple versions of an audio file , each version of the audio file being configured in audio data blocks of different block sizes and produced using different input sampling rates , wherein the block size represents a (LP filter excitation signal) number of data bits contained within an individual audio data block ;
choosing an appropriate version of the audio file that has a block size and an input sampling rate which produces a bit stream bit rate that is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks at the bit stream bit rate to the recipient's modem .

US5835495A
CLAIM 11
. A method for transmitting multiple digital audio files concurrently to a recipient , the recipient having a modem operating to receive digital data at an effective bit rate , the method comprising the following steps : performing for each digital audio file the following steps : configuring the audio file into individual audio data blocks , each audio data block containing a certain number of bits of digital audio data sampled at an input sampling rate ;
selecting a block size for the audio data blocks from among a set of available block sizes and an input sampling rate from among a set of available input sampling rates that determine a bit stream bit rate , the block size representing the number of bits of digital audio data in an individual audio data block ;
said selecting step (signal classification parameter) selecting the block size and the input sampling rate for each audio file which ensure that a total bit stream bit rate made up of combined bit stream bit rates of all digital audio files is less than or equal to the effective bit rate of the recipient's modem ;
and transmitting the audio data blocks for each audio file at the bit stream bit rate to the recipient's modem .

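For the US5835495A selection step cited above (claims 3 and 11), a hedged Python sketch of choosing a block size and input sampling rate whose bit stream bit rate fits the recipient modem's effective bit rate is shown below; the per-block header overhead, the candidate block sizes and sampling rates, and the even per-file budget split are all assumptions, not the reference's disclosure.

```python
HEADER_BITS = 32  # hypothetical per-block framing overhead

def bit_stream_rate(block_size, sampling_rate, bits_per_sample=8):
    """Bit stream bit rate for one version of an audio file: the payload rate
    plus per-block framing overhead (overhead size is an assumption)."""
    payload_rate = sampling_rate * bits_per_sample       # bits of audio per second
    blocks_per_second = payload_rate / block_size
    return payload_rate + blocks_per_second * HEADER_BITS

def select_version(num_files, effective_bit_rate,
                   block_sizes=(256, 512, 1024),
                   sampling_rates=(8000, 11025, 22050)):
    """Choose a (block size, input sampling rate) pair for one of num_files
    concurrent files so the combined bit stream rate stays at or below the
    modem's effective bit rate; the highest feasible rate wins."""
    per_file_budget = effective_bit_rate / num_files
    candidates = [(b, r, bit_stream_rate(b, r))
                  for b in block_sizes for r in sampling_rates]
    feasible = [c for c in candidates if c[2] <= per_file_budget]
    return max(feasible, key=lambda c: c[2]) if feasible else None
```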



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5774839A

Filed: 1995-09-29     Issued: 1998-06-30

Delayed decision switched prediction multi-stage LSF vector quantization

(Original Assignee) Rockwell International Corp     (Current Assignee) Nytell Software LLC

Eyal Shlomot
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (distance measure) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5774839A
CLAIM 1
. In a communication system for communicating input signals using a digital medium , the communication system comprising an encoder which receives and processes the input signals to generate a quantized data vector for either transmission or storage by the digital medium , the encoder comprising an analyzer for analyzing the input signals to generate a set of representative parameters associated with the input signals , and a quantizer for quantizing a sequence of data vectors from among the set of representative parameters corresponding to the input signals to generate the quantized data vector , the quantizer comprising : switched prediction means comprising a set of predictors for predicting a next vector element from said sequence of input data vectors to generate a set of prediction vectors ;
difference means coupled to said switched prediction means for subtracting said set of prediction vectors from said next vector element to generate a set of prediction error vectors ;
vector quantization means comprising a predetermined set of tables for quantizing said set of prediction error vectors to generate a set of quantized prediction error vectors , said vector quantization means comprising a plurality of stages , each of said plurality of stages comprising at least one of said set of tables and local decision means , wherein : a first stage quantizes said set of prediction error vectors from said difference means to generate a first set of candidates of quantization error vectors , by selecting , for each candidate in said first set of candidates , a prediction error vector and at least one entry from at least one of said set of tables according to a predetermined distance measure (impulse response) ;
a final stage , coupled to said first stage , quantizes said first set of candidates of quantization error vectors from first stage , to generate a final quantization error vector by selecting a member of said first set of candidates of quantization error vectors from said first stage and at least one entry from at least one of said set of tables , according to said predetermined distance measure ;
global decision means for selecting one predictor out of said set of predictors from said switched prediction means and selecting , for each of said first and final stages , at least one entry from said set of tables of said vector quantization means according to said predetermined distance measure , generating said quantized data vector .

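The claim 1 recovery step above (artificially constructing the periodic excitation as a low-pass filtered periodic train of pulses, with the first low-pass impulse response centered on the quantized first glottal pulse position and the remaining responses spaced by the average pitch value) can be sketched as follows; the numpy convolution and the symmetric impulse response are illustrative assumptions.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lowpass_ir):
    """Artificial periodic excitation for a lost onset frame: a train of unit
    pulses starting at the quantized first glottal pulse position, spaced by
    the average pitch value, and convolved with a low-pass filter impulse
    response so each pulse becomes a centered copy of that response."""
    pulses = np.zeros(frame_len)
    pos = float(first_pulse_pos)
    while pos < frame_len:
        pulses[int(round(pos))] = 1.0        # one pulse per pitch period
        pos += avg_pitch
    half = len(lowpass_ir) // 2
    excitation = np.convolve(pulses, lowpass_ir)          # low-pass filtered pulse train
    return excitation[half:half + frame_len]              # keep responses centered on the pulses
```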
US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (distance measure) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5774839A
CLAIM 1
. In a communication system for communicating input signals using a digital medium , the communication system comprising an encoder which receives and processes the input signals to generate a quantized data vector for either transmission or storage by the digital medium , the encoder comprising an analyzer for analyzing the input signals to generate a set of representative parameters associated with the input signals , and a quantizer for quantizing a sequence of data vectors from among the set of representative parameters corresponding to the input signals to generate the quantized data vector , the quantizer comprising : switched prediction means comprising a set of predictors for predicting a next vector element from said sequence of input data vectors to generate a set of prediction vectors ;
difference means coupled to said switched prediction means for subtracting said set of prediction vectors from said next vector element to generate a set of prediction error vectors ;
vector quantization means comprising a predetermined set of tables for quantizing said set of prediction error vectors to generate a set of quantized prediction error vectors , said vector quantization means comprising a plurality of stages , each of said plurality of stages comprising at least one of said set of tables and local decision means , wherein : a first stage quantizes said set of prediction error vectors from said difference means to generate a first set of candidates of quantization error vectors , by selecting , for each candidate in said first set of candidates , a prediction error vector and at least one entry from at least one of said set of tables according to a predetermined distance measure (impulse response) ;
a final stage , coupled to said first stage , quantizes said first set of candidates of quantization error vectors from first stage , to generate a final quantization error vector by selecting a member of said first set of candidates of quantization error vectors from said first stage and at least one entry from at least one of said set of tables , according to said predetermined distance measure ;
global decision means for selecting one predictor out of said set of predictors from said switched prediction means and selecting , for each of said first and final stages , at least one entry from said set of tables of said vector quantization means according to said predetermined distance measure , generating said quantized data vector .

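The energy adjustment relation of claim 9, E_q = E_1 · (E_LP0 / E_LP1), where E_LP0 and E_LP1 are the energies of the LP filter impulse responses on either side of the erasure, can be exercised with the sketch below; the 64-sample truncation of the impulse response and the use of scipy.signal.lfilter are assumptions for illustration.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the (truncated) impulse response of the all-pole LP synthesis
    filter 1/A(z); a_coeffs are the LP coefficients with a_coeffs[0] == 1."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1): scale the excitation energy in the first
    good frame after an erasure by the ratio of LP impulse-response energies
    of the last good frame before the erasure and the first good frame after it."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1
```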
US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (distance measure) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5774839A
CLAIM 1
. In a communication system for communicating input signals using a digital medium , the communication system comprising an encoder which receives and processes the input signals to generate a quantized data vector for either transmission or storage by the digital medium , the encoder comprising an analyzer for analyzing the input signals to generate a set of representative parameters associated with the input signals , and a quantizer for quantizing a sequence of data vectors from among the set of representative parameters corresponding to the input signals to generate the quantized data vector , the quantizer comprising : switched prediction means comprising a set of predictors for predicting a next vector element from said sequence of input data vectors to generate a set of prediction vectors ;
difference means coupled to said switched prediction means for subtracting said set of prediction vectors from said next vector element to generate a set of prediction error vectors ;
vector quantization means comprising a predetermined set of tables for quantizing said set of prediction error vectors to generate a set of quantized prediction error vectors , said vector quantization means comprising a plurality of stages , each of said plurality of stages comprising at least one of said set of tables and local decision means , wherein : a first stage quantizes said set of prediction error vectors from said difference means to generate a first set of candidates of quantization error vectors , by selecting , for each candidate in said first set of candidates , a prediction error vector and at least one entry from at least one of said set of tables according to a predetermined distance measure (impulse response) ;
a final stage , coupled to said first stage , quantizes said first set of candidates of quantization error vectors from first stage , to generate a final quantization error vector by selecting a member of said first set of candidates of quantization error vectors from said first stage and at least one entry from at least one of said set of tables , according to said predetermined distance measure ;
global decision means for selecting one predictor out of said set of predictors from said switched prediction means and selecting , for each of said first and final stages , at least one entry from said set of tables of said vector quantization means according to said predetermined distance measure , generating said quantized data vector .

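The US5774839A reference cited above claims a multi-stage, switched-prediction vector quantizer that selects table entries by a predetermined distance measure; a minimal two-stage Python sketch with a squared-error distance is given below (real multi-stage quantizers keep several candidates per stage plus a global decision, which this sketch omits).

```python
import numpy as np

def two_stage_vq(error_vector, stage1_table, stage2_table):
    """Two-stage vector quantization of a prediction error vector using a
    squared-error distance measure: the first stage picks its closest table
    entry, the final stage quantizes the remaining error."""
    error_vector = np.asarray(error_vector, dtype=float)
    d1 = np.sum((stage1_table - error_vector) ** 2, axis=1)   # distance measure, first stage
    i1 = int(np.argmin(d1))
    residual = error_vector - stage1_table[i1]
    d2 = np.sum((stage2_table - residual) ** 2, axis=1)       # distance measure, final stage
    i2 = int(np.argmin(d2))
    quantized = stage1_table[i1] + stage2_table[i2]
    return (i1, i2), quantized
```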
US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response (distance measure) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5774839A
CLAIM 1
. In a communication system for communicating input signals using a digital medium , the communication system comprising an encoder which receives and processes the input signals to generate a quantized data vector for either transmission or storage by the digital medium , the encoder comprising an analyzer for analyzing the input signals to generate a set of representative parameters associated with the input signals , and a quantizer for quantizing a sequence of data vectors from among the set of representative parameters corresponding to the input signals to generate the quantized data vector , the quantizer comprising : switched prediction means comprising a set of predictors for predicting a next vector element from said sequence of input data vectors to generate a set of prediction vectors ;
difference means coupled to said switched prediction means for subtracting said set of prediction vectors from said next vector element to generate a set of prediction error vectors ;
vector quantization means comprising a predetermined set of tables for quantizing said set of prediction error vectors to generate a set of quantized prediction error vectors , said vector quantization means comprising a plurality of stages , each of said plurality of stages comprising at least one of said set of tables and local decision means , wherein : a first stage quantizes said set of prediction error vectors from said difference means to generate a first set of candidates of quantization error vectors , by selecting , for each candidate in said first set of candidates , a prediction error vector and at least one entry from at least one of said set of tables according to a predetermined distance measure (impulse response) ;
a final stage , coupled to said first stage , quantizes said first set of candidates of quantization error vectors from first stage , to generate a final quantization error vector by selecting a member of said first set of candidates of quantization error vectors from said first stage and at least one entry from at least one of said set of tables , according to said predetermined distance measure ;
global decision means for selecting one predictor out of said set of predictors from said switched prediction means and selecting , for each of said first and final stages , at least one entry from said set of tables of said vector quantization means according to said predetermined distance measure , generating said quantized data vector .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (distance measure) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5774839A
CLAIM 1
. In a communication system for communicating input signals using a digital medium , the communication system comprising an encoder which receives and processes the input signals to generate a quantized data vector for either transmission or storage by the digital medium , the encoder comprising an analyzer for analyzing the input signals to generate a set of representative parameters associated with the input signals , and a quantizer for quantizing a sequence of data vectors from among the set of representative parameters corresponding to the input signals to generate the quantized data vector , the quantizer comprising : switched prediction means comprising a set of predictors for predicting a next vector element from said sequence of input data vectors to generate a set of prediction vectors ;
difference means coupled to said switched prediction means for subtracting said set of prediction vectors from said next vector element to generate a set of prediction error vectors ;
vector quantization means comprising a predetermined set of tables for quantizing said set of prediction error vectors to generate a set of quantized prediction error vectors , said vector quantization means comprising a plurality of stages , each of said plurality of stages comprising at least one of said set of tables and local decision means , wherein : a first stage quantizes said set of prediction error vectors from said difference means to generate a first set of candidates of quantization error vectors , by selecting , for each candidate in said first set of candidates , a prediction error vector and at least one entry from at least one of said set of tables according to a predetermined distance measure (impulse response) ;
a final stage , coupled to said first stage , quantizes said first set of candidates of quantization error vectors from first stage , to generate a final quantization error vector by selecting a member of said first set of candidates of quantization error vectors from said first stage and at least one entry from at least one of said set of tables , according to said predetermined distance measure ;
global decision means for selecting one predictor out of said set of predictors from said switched prediction means and selecting , for each of said first and final stages , at least one entry from said set of tables of said vector quantization means according to said predetermined distance measure , generating said quantized data vector .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (distance measure) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5774839A
CLAIM 1
. In a communication system for communicating input signals using a digital medium , the communication system comprising an encoder which receives and processes the input signals to generate a quantized data vector for either transmission or storage by the digital medium , the encoder comprising an analyzer for analyzing the input signals to generate a set of representative parameters associated with the input signals , and a quantizer for quantizing a sequence of data vectors from among the set of representative parameters corresponding to the input signals to generate the quantized data vector , the quantizer comprising : switched prediction means comprising a set of predictors for predicting a next vector element from said sequence of input data vectors to generate a set of prediction vectors ;
difference means coupled to said switched prediction means for subtracting said set of prediction vectors from said next vector element to generate a set of prediction error vectors ;
vector quantization means comprising a predetermined set of tables for quantizing said set of prediction error vectors to generate a set of quantized prediction error vectors , said vector quantization means comprising a plurality of stages , each of said plurality of stages comprising at least one of said set of tables and local decision means , wherein : a first stage quantizes said set of prediction error vectors from said difference means to generate a first set of candidates of quantization error vectors , by selecting , for each candidate in said first set of candidates , a prediction error vector and at least one entry from at least one of said set of tables according to a predetermined distance measure (impulse response) ;
a final stage , coupled to said first stage , quantizes said first set of candidates of quantization error vectors from first stage , to generate a final quantization error vector by selecting a member of said first set of candidates of quantization error vectors from said first stage and at least one entry from at least one of said set of tables , according to said predetermined distance measure ;
global decision means for selecting one predictor out of said set of predictors from said switched prediction means and selecting , for each of said first and final stages , at least one entry from said set of tables of said vector quantization means according to said predetermined distance measure , generating said quantized data vector .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5704003A

Filed: 1995-09-19     Issued: 1997-12-30

RCELP coder

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Willem Bastiaan Kleijn, Dror Nahumi
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

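The US5704003A reference above determines a time shift T for each sub-frame of the residual against a delayed residual from a previously-occurring frame; a simplified Python sketch using normalized cross-correlation as a stand-in matching criterion follows (the actual criterion of its claim 2 is the patent's own equation, which this sketch does not reproduce, and the circular shift is a crude approximation).

```python
import numpy as np

def best_time_shift(current_subframe, delayed_residual, max_shift=8):
    """Pick the time shift T that best aligns the current sub-frame of the
    residual with the delayed residual, scoring candidate shifts by
    normalized cross-correlation (a stand-in matching criterion)."""
    x = np.asarray(current_subframe, dtype=float)
    y = np.asarray(delayed_residual, dtype=float)[:len(x)]
    best_t, best_score = 0, -np.inf
    for t in range(-max_shift, max_shift + 1):
        shifted = np.roll(x, -t)                          # circular shift approximation
        denom = np.linalg.norm(shifted) * np.linalg.norm(y)
        score = float(np.dot(shifted, y) / denom) if denom else 0.0
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score
```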
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (average energy) per sample for other frames .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US5704003A
CLAIM 5
. An improved method of speech coding as set forth in claim 4 wherein a sub-frame of the residual signal is time shifted by time shift T only if (a) G opt is greater than or equal to a specified first threshold value , and (b) a peak-to-average ratio is greater than or equal to a specified second threshold value , wherein the peak-to-average ratio is defined as the ratio of the energy of a pulse in a sub-frame of the residual signal to the average energy (average energy) of the residual signal in that sub-frame , thereby eliminating or reducing the undesired introduction of periodicity into non-periodic speech segments .

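Claim 5 of US5704003A, quoted above, gates the time shift on G_opt and on a peak-to-average ratio of pulse energy to average sub-frame energy; a small Python sketch of that gating follows, with illustrative threshold values.

```python
import numpy as np

def apply_shift_if_periodic(subframe, g_opt, g_threshold=0.5, par_threshold=2.0):
    """Return True only when the matching gain G_opt and the peak-to-average
    ratio (peak sample energy over average sub-frame energy) both clear their
    thresholds, so periodicity is not forced onto non-periodic segments."""
    energies = np.asarray(subframe, dtype=float) ** 2
    peak_to_average = float(np.max(energies) / (np.mean(energies) + 1e-12))
    return g_opt >= g_threshold and peak_to_average >= par_threshold
```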
US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

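The claim 5 recovery step above (scaling the synthesized signal so its energy at the start of the first good frame matches the end of the last concealed frame, then converging toward the received energy information while limiting any increase) can be pictured with the sketch below; the quarter-frame energy windows, the linear gain ramp and the gain-step cap are assumptions, not the patented procedure.

```python
import numpy as np

def control_recovery_energy(synth_frame, last_erased_end_energy, target_energy,
                            max_gain_step=1.2):
    """Scale the first good frame after an erasure so its starting energy
    matches the end of the last concealed frame, then ramp the gain linearly
    toward the received energy information while capping any energy increase."""
    frame = np.asarray(synth_frame, dtype=float)
    n = len(frame)
    start_energy = float(np.mean(frame[:n // 4] ** 2)) + 1e-12
    end_energy = float(np.mean(frame[-(n // 4):] ** 2)) + 1e-12
    g0 = np.sqrt(last_erased_end_energy / start_energy)                 # match preceding energy
    g1 = min(np.sqrt(target_energy / end_energy), g0 * max_gain_step)   # limit the increase
    gains = np.linspace(g0, g1, n)                                      # sample-by-sample interpolation
    return frame * gains
```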
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5704003A
CLAIM 2
. An improved method of speech coding as set forth in claim 1 wherein the time shift T is determined using a matching criterion defined as ##EQU8## wherein (r(n-T)) is the residual signal of the current frame (current frame, decoder determines concealment) shifted by time T , r(n-D(n)) is the delayed residual signal from a previously-occurring frame , n is a positive integer , r is the instantaneous amplitude of the residual signal , and D(n) represents the sample-to-sample pitch delay determined by applying linear interpolation to known pitch delay values occurring at or near frame-to-frame boundaries .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

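The claim 10 phase-information element above encodes a shape, sign and amplitude for the first glottal pulse; a hedged Python sketch follows, in which the shape codebook, the normalization and the 4-bit amplitude index are illustrative assumptions rather than the patent's quantizers.

```python
import numpy as np

def encode_first_glottal_pulse(residual, pulse_pos, shape_codebook):
    """Encode the first glottal pulse as a shape index (closest codebook
    waveform under squared error), a sign and a crude 4-bit amplitude index."""
    residual = np.asarray(residual, dtype=float)
    seg_len = shape_codebook.shape[1]
    segment = np.zeros(seg_len)
    avail = residual[pulse_pos:pulse_pos + seg_len]      # pad if the frame ends early
    segment[:len(avail)] = avail
    sign = 1 if segment[np.argmax(np.abs(segment))] >= 0 else -1
    amplitude = float(np.max(np.abs(segment)))
    normalized = sign * segment / (amplitude + 1e-12)
    shape_idx = int(np.argmin(np.sum((shape_codebook - normalized) ** 2, axis=1)))
    amp_idx = min(int(amplitude * 16), 15)               # hypothetical 4-bit amplitude quantizer
    return shape_idx, sign, amp_idx
```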
US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame (current frame) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US5704003A
CLAIM 2
. An improved method of speech coding as set forth in claim 1 wherein the time shift T is determined using a matching criterion defined as ##EQU8## wherein (r(n-T)) is the residual signal of the current frame (current frame, decoder determines concealment) shifted by time T , r(n-D(n)) is the delayed residual signal from a previously-occurring frame , n is a positive integer , r is the instantaneous amplitude of the residual signal , and D(n) represents the sample-to-sample pitch delay determined by applying linear interpolation to known pitch delay values occurring at or near frame-to-frame boundaries .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (average energy) per sample for other frames .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US5704003A
CLAIM 5
. An improved method of speech coding as set forth in claim 4 wherein a sub-frame of the residual signal is time shifted by time shift T only if (a) G opt is greater than or equal to a specified first threshold value , and (b) a peak-to-average ratio is greater than or equal to a specified second threshold value , wherein the peak-to-average ratio is defined as the ratio of the energy of a pulse in a sub-frame of the residual signal to the average energy (average energy) of the residual signal in that sub-frame , thereby eliminating or reducing the undesired introduction of periodicity into non-periodic speech segments .

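A minimal sketch of the "computer of the energy information parameter" limitation of claim 16, assuming a hypothetical frame buffer and class labels (illustrative only, not the codec's reference routine):

import numpy as np

def energy_information(frame: np.ndarray, frame_class: str) -> float:
    """Energy-information parameter for one frame.

    frame       -- samples of the frame (assumed layout)
    frame_class -- one of "unvoiced", "unvoiced_transition",
                   "voiced_transition", "voiced", "onset"
    """
    if frame_class in ("voiced", "onset"):
        # Relate the parameter to the maximum of the signal energy
        # (here read as the largest squared sample, a plausible reading).
        return float(np.max(frame ** 2))
    # Other classes: relate the parameter to the average energy per sample.
    return float(np.mean(frame ** 2))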
US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

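A minimal sketch of the energy control recited in claim 17: the synthesized signal is scaled so that its energy at the start of the first good frame after an erasure matches the energy at the end of the last concealed frame, then converges toward the energy implied by the received energy parameter by the end of that frame, with any increase in energy capped. The linear gain trajectory and the cap value are assumptions for illustration.

import numpy as np

def scale_recovered_frame(frame, e_end_concealed, e_received, max_gain=2.0):
    e_begin = np.mean(frame[: len(frame) // 4] ** 2) + 1e-12  # energy near the frame start
    g0 = np.sqrt(e_end_concealed / e_begin)                   # match the concealed-frame energy
    g1 = np.sqrt(e_received / (np.mean(frame ** 2) + 1e-12))  # target from the received parameter
    g0, g1 = min(g0, max_gain), min(g1, max_gain)             # limit any increase in energy
    gains = np.linspace(g0, g1, num=len(frame))               # converge toward the frame end
    return frame * gains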
US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5704003A
CLAIM 2
. An improved method of speech coding as set forth in claim 1 wherein the time shift T is determined using a matching criterion defined as ##EQU8## wherein (r(n-T)) is the residual signal of the current frame (current frame, decoder determines concealment) shifted by time T , r(n-D(n)) is the delayed residual signal from a previously-occurring frame , n is a positive integer , r is the instantaneous amplitude of the residual signal , and D(n) represents the sample-to-sample pitch delay determined by applying linear interpolation to known pitch delay values occurring at or near frame-to-frame boundaries .

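A minimal sketch of the adjustment in claims 20-21: when the LP-filter gain of the first good frame after an erasure exceeds that of the last frame of the erasure, the excitation energy is adjusted using E_q = E_1 · (E_LP0 / E_LP1), where E_1 is the energy at the end of the current frame and E_LP0, E_LP1 are the energies of the LP-filter impulse responses of the last good frame before the erasure and of the first good frame after it. The helper below estimates impulse-response energies from LP coefficients and uses the last good frame's filter as a stand-in for the last erased frame's (extrapolated) filter; it is an illustration, not the codec's routine.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a, n=64):
    # Energy of the impulse response of the synthesis filter 1/A(z),
    # with a = [1, a1, ..., ap].
    impulse = np.zeros(n)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    e_lp0 = lp_impulse_response_energy(a_last_good)    # E_LP0
    e_lp1 = lp_impulse_response_energy(a_first_good)   # E_LP1
    if e_lp1 > e_lp0:                # higher LP-filter gain in the first good frame
        return e1 * (e_lp0 / e_lp1)  # E_q = E_1 * (E_LP0 / E_LP1)
    return e1                        # otherwise leave the excitation energy unchanged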
US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

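A minimal sketch of the phase-information determination in claim 23: the first glottal pulse is taken as the sample of maximum amplitude within the first pitch period of the frame, and its position is quantized. The uniform quantization grid below is an assumption for illustration.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Locate and quantize the first glottal pulse position within a frame."""
    segment = residual[:pitch_period]         # first pitch period of the frame
    pos = int(np.argmax(np.abs(segment)))     # sample of maximum amplitude
    q_pos = step * round(pos / step)          # quantized position (assumed uniform grid)
    return pos, q_pos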
US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (average energy) per sample for other frames .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US5704003A
CLAIM 5
. An improved method of speech coding as set forth in claim 4 wherein a sub-frame of the residual signal is time shifted by time shift T only if (a) G opt is greater than or equal to a specified first threshold value , and (b) a peak-to-average ratio is greater than or equal to a specified second threshold value , wherein the peak-to-average ratio is defined as the ratio of the energy of a pulse in a sub-frame of the residual signal to the average energy (average energy) of the residual signal in that sub-frame , thereby eliminating or reducing the undesired introduction of periodicity into non-periodic speech segments .

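A minimal sketch of the gating condition of US5704003A claim 5: a sub-frame of the residual is time-shifted only if the matching gain G_opt reaches a first threshold and the peak-to-average ratio (energy of the pulse in the sub-frame divided by the average residual energy of that sub-frame) reaches a second threshold. The threshold values are placeholders, not the patent's.

import numpy as np

def allow_time_shift(subframe, g_opt, g_thresh=0.5, par_thresh=4.0):
    pulse_energy = float(np.max(subframe ** 2))          # energy of the dominant pulse
    avg_energy = float(np.mean(subframe ** 2)) + 1e-12   # average energy of the sub-frame
    peak_to_average = pulse_energy / avg_energy
    return g_opt >= g_thresh and peak_to_average >= par_thresh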
US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (preceding frame) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame (current frame) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5704003A
CLAIM 1
. A method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames , each frame having a plurality of sub-frames including a current sub-frame present during a specified time interval , each frame having a pitch delay value specifying the change in pitch with reference to the immediately preceding frame (signal classification parameter) , each sub-frame including a plurality of samples , and the digitized speech is partitioned into periodic components and a residual signal ;
the improved method of speech coding comprising the steps of : (a) for each of a plurality of sub-frames of the residual signal , determining a time shift T based upon (i) the current sub-frame of the residual signal , and (ii) a delayed residual signal from a previously-occurring frame ;
and (b) applying the time shift T determined in step (a) to the current sub-frame of the residual signal .

US5704003A
CLAIM 2
. An improved method of speech coding as set forth in claim 1 wherein the time shift T is determined using a matching criterion defined as ##EQU8## wherein (r(n-T)) is the residual signal of the current frame (current frame, decoder determines concealment) shifted by time T , r(n-D(n)) is the delayed residual signal from a previously-occurring frame , n is a positive integer , r is the instantaneous amplitude of the residual signal , and D(n) represents the sample-to-sample pitch delay determined by applying linear interpolation to known pitch delay values occurring at or near frame-to-frame boundaries .

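A minimal sketch of the time-shift search of US5704003A claims 1-2. The claim's exact matching criterion (##EQU8##) is not reproduced in the chart, so the normalized cross-correlation below is only a stand-in of the same general kind: the current residual sub-frame r(n-T) is matched against the delayed residual r(n-D(n)), with D(n) obtained by linear interpolation of the pitch-delay values at the frame boundaries.

import numpy as np

def interpolated_pitch_delay(d_prev, d_curr, n_samples):
    # Sample-by-sample pitch delay D(n), linearly interpolated between the
    # pitch-delay values at the previous and current frame boundaries.
    return np.linspace(d_prev, d_curr, num=n_samples)

def best_time_shift(residual, start, length, d_prev, d_curr, max_shift=8):
    # Assumes 'start' leaves at least max(D(n)) + max_shift samples of margin
    # before it and max_shift samples of margin after start + length.
    d = interpolated_pitch_delay(d_prev, d_curr, length)
    n = np.arange(start, start + length)
    target = residual[np.round(n - d).astype(int)]   # delayed residual r(n - D(n))
    best_t, best_score = 0, -np.inf
    for t in range(-max_shift, max_shift + 1):
        cand = residual[n - t]                       # current sub-frame shifted by t
        score = float(np.dot(cand, target)) / (
            np.linalg.norm(cand) * np.linalg.norm(target) + 1e-12)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score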



US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5774837A

Filed: 1995-09-13     Issued: 1998-06-30

Speech coding system and method using voicing probability determination

(Original Assignee) Voxware Inc     (Current Assignee) Voxware Inc

Suat Yeldener, Joseph Gerard Aguilar
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (generating filter) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (generating filter) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (generating filter) up to the end of a last subframe affected by the artificial construction of the periodic part .
US5774837A
CLAIM 32
. The system of claim 31 wherein the parameters representative of the unvoiced portion of the signal are related to the LPC coefficients for the unvoiced portion of the signal and the means for synthesizing unvoiced speech further comprises : means for generating filtered (first impulse, first impulse response, impulse responses, impulse response) white noise signal ;
means for selecting on the basis of the voicing probability Pv of a filtered white noise excitation signal ;
and a time varying autoregressive digital filter the coefficients of which are determined by the parameters representing the unvoiced portion of the signal .

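A minimal sketch of the artificial periodic excitation of claims 1 and 13: when an onset frame is lost, the periodic part is rebuilt as a low-pass filtered train of pulses, with the first impulse response centred on the quantized position of the first glottal pulse and the remaining impulse responses spaced one average pitch period apart until the end of the last affected subframe. The short FIR low-pass taps are an assumption.

import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch,
                              lp_taps=(0.18, 0.64, 0.18)):
    lp_taps = np.asarray(lp_taps)
    excitation = np.zeros(frame_len)
    pos = float(first_pulse_pos)
    while pos < frame_len:
        idx = int(round(pos))
        if idx < frame_len:
            excitation[idx] = 1.0        # unit pulse at each glottal position
        pos += avg_pitch                 # next pulse one (average) pitch period later
    # Convolving with the low-pass taps centres one impulse response on each pulse.
    half = len(lp_taps) // 2
    return np.convolve(excitation, lp_taps, mode="full")[half:half + frame_len]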
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

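A minimal restatement, in code, of the phase-continuity rule of US5774837A claim 21 as reconstructed above: the phase of harmonic h in the current voiced segment is tied to the previous segment's values so that each harmonic evolves continuously across segment boundaries.

def continuous_phase(h, phi_prev_M, xi_prev_h):
    """xi(h) = (h + 1) * phi_prev(M) + xi_prev(h), with phi_prev(M) and
    xi_prev(h) the corresponding quantities of the previous segment."""
    return (h + 1) * phi_prev_M + xi_prev_h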
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal, initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

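A minimal sketch of the gain rules of claims 6-7: after an erasure, the gain used to scale the synthesized signal at the beginning of the first good frame is capped when that frame is classified as onset, and is set equal to the end-of-frame gain during a voiced-to-unvoiced transition or a comfort-noise-to-active-speech transition. The class and coding labels and the cap value are illustrative assumptions.

def begin_gain(g_begin, g_end, first_good_class, last_good_class,
               last_good_coding, first_good_coding, onset_cap=1.0):
    if first_good_class == "onset":
        return min(g_begin, onset_cap)   # claim 6: limit the gain to a given value
    voiced_to_unvoiced = (last_good_class in ("voiced_transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    cn_to_active = (last_good_coding == "comfort_noise"
                    and first_good_coding == "active_speech")
    if voiced_to_unvoiced or cn_to_active:
        return g_end                     # claim 7: reuse the end-of-frame gain
    return g_begin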
US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US5774837A
CLAIM 23
. The method of claim 22 further comprising the step of generating voice effects by varying the length of the synthesized signal (LP filter excitation signal) segments and adjusting the amplitudes and frequencies of the harmonics to a target range of values on the basis of a linear interpolation of the parameters encoded in the data packet .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (generating filter) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5774837A
CLAIM 23
. The method of claim 22 further comprising the step of generating voice effects by varying the length of the synthesized signal (LP filter excitation signal) segments and adjusting the amplitudes and frequencies of the harmonics to a target range of values on the basis of a linear interpolation of the parameters encoded in the data packet .

US5774837A
CLAIM 32
. The system of claim 31 wherein the parameters representative of the unvoiced portion of the signal are related to the LPC coefficients for the unvoiced portion of the signal and the means for synthesizing unvoiced speech further comprises : means for generating filtered (first impulse, first impulse response, impulse responses, impulse response) white noise signal ;
means for selecting on the basis of the voicing probability Pv of a filtered white noise excitation signal ;
and a time varying autoregressive digital filter the coefficients of which are determined by the parameters representing the unvoiced portion of the signal .

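A minimal sketch of the unvoiced synthesis of US5774837A claim 32: white noise is generated and shaped by a time-varying autoregressive (LPC) filter whose coefficients come from the unvoiced-band parameters. The fixed frame length and the use of scipy.signal.lfilter are illustrative choices.

import numpy as np
from scipy.signal import lfilter

def synthesize_unvoiced(lpc_coeffs, frame_len=160, gain=1.0, seed=0):
    """lpc_coeffs: AR coefficients [1, a1, ..., ap] for this frame (assumed)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(frame_len)           # white-noise excitation
    return gain * lfilter([1.0], lpc_coeffs, noise)  # all-pole (AR) spectral shaping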
US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response (generating filter) of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US5774837A
CLAIM 23
. The method of claim 22 further comprising the step of generating voice effects by varying the length of the synthesized signal (LP filter excitation signal) segments and adjusting the amplitudes and frequencies of the harmonics to a target range of values on the basis of a linear interpolation of the parameters encoded in the data packet .

US5774837A
CLAIM 32
. The system of claim 31 wherein the parameters representative of the unvoiced portion of the signal are related to the LPC coefficients for the unvoiced portion of the signal and the means for synthesizing unvoiced speech further comprises : means for generating filtered (first impulse, first impulse response, impulse responses, impulse response) white noise signal ;
means for selecting on the basis of the voicing probability Pv of a filtered white noise excitation signal ;
and a time varying autoregressive digital filter the coefficients of which are determined by the parameters representing the unvoiced portion of the signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (encoded parameter) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (generating filter) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (generating filter) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response (generating filter) up to an end of a last subframe affected by the artificial construction of the periodic part .
US5774837A
CLAIM 15
. A method for synthesizing audio signals from data packets , each data packet representing a time segment of a signal , said at least one data packet comprising : a fundamental frequency parameter , voicing probability Pv defined as a ratio between voiced and unvoiced components of the signal in the segment , and a sequence of encoded parameters (decoder constructs) representative of the voiced portion and the unvoiced portion of the signal , the method comprising the steps of : decoding at least one data packet to extract said fundamental frequency , the number of harmonics H corresponding to said fundamental frequency said voicing probability Pv and said sequence of encoded parameters representative of the voiced and unvoiced portions of the signal ;
and synthesizing an audio signal in response to the detected fundamental frequency , wherein the low frequency band of the spectrum is synthesized using only parameters representative of the voiced portion of the signal ;
the high frequency band of the spectrum is synthesized using only parameters representative of the unvoiced portion of the signal and the boundary between the low frequency band and the high frequency band of the spectrum is determined on the basis of the decoded voicing probability Pv and the number of harmonics H .

US5774837A
CLAIM 32
. The system of claim 31 wherein the parameters representative of the unvoiced portion of the signal are related to the LPC coefficients for the unvoiced portion of the signal and the means for synthesizing unvoiced speech further comprises : means for generating filtered (first impulse, first impulse response, impulse responses, impulse response) white noise signal ;
means for selecting on the basis of the voicing probability Pv of a filtered white noise excitation signal ;
and a time varying autoregressive digital filter the coefficients of which are determined by the parameters representing the unvoiced portion of the signal .

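A minimal sketch of the band split of US5774837A claim 15: the low band of the spectrum is synthesized from the voiced parameters and the high band from the unvoiced parameters, with the boundary determined from the voicing probability Pv and the number of harmonics H. Taking the boundary harmonic as round(Pv * H) is an assumption; the claim only requires that Pv and H determine the boundary.

def band_split(pv: float, num_harmonics: int, f0_hz: float):
    boundary_harmonic = round(pv * num_harmonics)  # harmonics up to this index are voiced
    boundary_hz = boundary_harmonic * f0_hz        # corresponding split frequency
    return boundary_harmonic, boundary_hz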
US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)·ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US5774837A
CLAIM 23
. The method of claim 22 further comprising the step of generating voice effects by varying the length of the synthesized signal (LP filter excitation signal) segments and adjusting the amplitudes and frequencies of the harmonics to a target range of values on the basis of a linear interpolation of the parameters encoded in the data packet .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (generating filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5774837A
CLAIM 23
. The method of claim 22 further comprising the step of generating voice effects by varying the length of the synthesized signal (LP filter excitation signal) segments and adjusting the amplitudes and frequencies of the harmonics to a target range of values on the basis of a linear interpolation of the parameters encoded in the data packet .

US5774837A
CLAIM 32
. The system of claim 31 wherein the parameters representative of the unvoiced portion of the signal are related to the LPC coefficients for the unvoiced portion of the signal and the means for synthesizing unvoiced speech further comprises : means for generating filter (first impulse, first impulse response, impulse responses, impulse response) ed white noise signal ;
means for selecting on the basis of the voicing probability Pv of a filtered white noise excitation signal ;
and a time varying autoregressive digital filter the coefficients of which are determined by the parameters representing the unvoiced portion of the signal .
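
After removal of the extraction artefacts, the relation recited in claim 21 of the '710 patent above reads E_q = E_1·√(E_LP0/E_LP1). The short numerical sketch below illustrates the arithmetic; the LP coefficients, the impulse-response length of 64 samples and the value of E_1 are arbitrary assumptions chosen only to make the computation concrete.

```python
# Numerical sketch of E_q = E_1 * sqrt(E_LP0 / E_LP1) from '710 claim 21 (illustrative).
# The LP coefficients and impulse-response length below are arbitrary assumptions.
import numpy as np

def lp_impulse_response_energy(a, n=64):
    """Energy of the impulse response of the all-pole filter 1/A(z), A(z)=1+a1*z^-1+..."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k, ak in enumerate(a, start=1):
            if i - k >= 0:
                acc -= ak * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

E_LP0 = lp_impulse_response_energy([-0.9])   # last non erased frame before the erasure
E_LP1 = lp_impulse_response_energy([-0.5])   # first non erased frame after the erasure
E_1 = 2.0e4                                  # energy at the end of the current frame
E_q = E_1 * np.sqrt(E_LP0 / E_LP1)           # target energy for the excitation
print(E_LP0, E_LP1, E_q)
```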

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .
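
Claim 22 of the '710 patent above has the searcher encode a shape, sign and amplitude for the first glottal pulse. The sketch below illustrates one plausible way such a triplet could be formed; the two-entry shape codebook, the three-sample pulse window and the 0.05 amplitude quantiser step are hypothetical and are not taken from the patent.

```python
# Illustrative sketch of encoding the first glottal pulse as (shape, sign, amplitude),
# as recited in '710 claim 22.  The codebook of pulse shapes, the quantiser step and
# the residual signal below are assumptions introduced only for this example.
import numpy as np

SHAPE_CODEBOOK = np.array([
    [0.2, 1.0, 0.2],   # hypothetical shape 0: narrow pulse
    [0.5, 1.0, 0.5],   # hypothetical shape 1: wider pulse
])

def encode_first_glottal_pulse(residual, pulse_pos):
    """Return (shape_index, sign_bit, quantised_amplitude) for the pulse at pulse_pos."""
    segment = residual[pulse_pos - 1:pulse_pos + 2]
    amp = np.max(np.abs(segment))
    sign = 0 if residual[pulse_pos] >= 0 else 1
    # Pick the codebook shape closest (by correlation) to the normalised segment.
    norm = segment / (amp if amp > 0 else 1.0)
    shape_idx = int(np.argmax(SHAPE_CODEBOOK @ norm))
    q_amp = round(amp / 0.05) * 0.05          # uniform 0.05-step quantiser (assumed)
    return shape_idx, sign, q_amp

residual = np.zeros(40)
residual[12:15] = [0.3, 0.9, 0.4]             # synthetic pulse around sample 13
print(encode_first_glottal_pulse(residual, 13))
```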

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .
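
Claim 23 of the '710 patent above takes the sample of maximum amplitude within a pitch period as the first glottal pulse and quantises its position. A minimal sketch of that search and quantisation follows; the 4-sample quantiser step and the synthetic residual are assumptions for illustration.

```python
# Sketch of locating and quantising the first glottal pulse position as in '710 claim 23:
# the pulse is taken as the maximum-amplitude sample inside one pitch period, and its
# position is quantised.  The 4-sample quantiser step is an assumption for illustration.
import numpy as np

def find_and_quantise_pulse(residual, pitch_period, step=4):
    window = residual[:pitch_period]
    pos = int(np.argmax(np.abs(window)))       # sample of maximum amplitude
    q_pos = int(round(pos / step)) * step      # uniformly quantised position
    return pos, q_pos

rng = np.random.default_rng(0)
residual = rng.normal(0.0, 0.05, 80)
residual[37] = 1.0                             # synthetic glottal pulse at sample 37
print(find_and_quantise_pulse(residual, pitch_period=60))   # -> (37, 36)
```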

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5774837A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and the step of detecting the presence of a fundamental frequency F 0 comprises the step of computing the spectrum of the signal .

US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 18
. The method of claim 17 wherein the parameters representative of the voiced portion of the signal comprise a set of amplitudes for harmonic frequencies within the voiced portion of the spectrum , and the step of synthesizing a voiced speech further comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .
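
Claim 24 of the '710 patent above computes the energy information parameter from a maximum of the signal energy for voiced or onset frames and from an average energy per sample for other frames. The sketch below is a plain reading of that claim language; the patent's own computation (for example, any pitch-synchronous windowing) may differ.

```python
# Sketch of the two-branch energy information parameter of '710 claim 24: a maximum of the
# signal energy for voiced/onset frames, and an average energy per sample for other frames.
# The per-sample squared-signal maximum is an assumption based on the claim wording.
import numpy as np

def energy_information(frame, frame_class):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x ** 2))           # maximum of the signal energy
    return float(np.mean(x ** 2))              # average energy per sample

print(energy_information([0.1, -0.5, 0.2], "voiced"))    # 0.25
print(energy_information([0.1, -0.5, 0.2], "unvoiced"))  # 0.10
```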

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (low end) and a phase information parameter (boundary condition) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (synthesized signal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response (generating filter) of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5774837A
CLAIM 3
. The method of claim 2 wherein the voiced portion of the signal occupies the low end (energy information parameter) of the spectrum and the unvoiced portion of the signal occupies the high end of the spectrum for each segment .

US5774837A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)ø⁻(M) + ξ⁻(h) , where ø⁻(M) and ξ⁻(h) are the corresponding quantities of the previous segment .

US5774837A
CLAIM 23
. The method of claim 22 further comprising the step of generating voice effects by varying the length of the synthesized signal (LP filter excitation signal) segments and adjusting the amplitudes and frequencies of the harmonics to a target range of values on the basis of a linear interpolation of the parameters encoded in the data packet .

US5774837A
CLAIM 32
. The system of claim 31 wherein the parameters representative of the unvoiced portion of the signal are related to the LPC coefficients for the unvoiced portion of the signal and the means for synthesizing unvoiced speech further comprises : means for generating filter (first impulse, first impulse response, impulse responses, impulse response) ed white noise signal ;
means for selecting on the basis of the voicing probability Pv of a filtered white noise excitation signal ;
and a time varying autoregressive digital filter the coefficients of which are determined by the parameters representing the unvoiced portion of the signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5749065A

Filed: 1995-08-23     Issued: 1998-05-05

Speech encoding method, speech decoding method and speech encoding/decoding method

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Masayuki Nishiguchi, Jun Matsumoto
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (fixed number) affected by the artificial construction of the periodic part .
US5749065A
CLAIM 13
. A method for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal which is divided on a time axis on a block basis , the short-term prediction residue being represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the method for decoding comprising the steps of : finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number (last subframe) of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing a time-axis waveform signal based on the short-term prediction residual waveform of the encoded speech signal .

US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .
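
Claim 1 of the '710 patent above constructs the periodic excitation artificially as a low-pass filtered train of pulses: the first impulse response is centred on the quantised position of the first glottal pulse, and the remaining ones are placed one average pitch value apart up to the end of the affected subframes. The sketch below illustrates that construction; the low-pass filter taps, frame length, pulse position and pitch value are assumptions chosen only for the example.

```python
# Sketch of the artificial periodic excitation of '710 claim 1: a low-pass filtered train of
# pulses, the first impulse response centred on the quantised glottal-pulse position and the
# remaining ones spaced by the average pitch value.  The filter taps, frame length and pitch
# values are assumptions used only to make the construction concrete.
import numpy as np

def artificial_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, lp_taps):
    exc = np.zeros(frame_len)
    half = len(lp_taps) // 2
    pos = q_pulse_pos
    while pos < frame_len:
        lo, hi = pos - half, pos - half + len(lp_taps)
        t_lo = max(0, -lo)
        t_hi = len(lp_taps) - max(0, hi - frame_len)
        exc[max(0, lo):min(frame_len, hi)] += lp_taps[t_lo:t_hi]
        pos += avg_pitch                      # next impulse one average pitch value later
    return exc

lp_taps = np.array([0.25, 0.5, 1.0, 0.5, 0.25])   # assumed low-pass impulse response
exc = artificial_periodic_excitation(frame_len=160, q_pulse_pos=12, avg_pitch=57,
                                     lp_taps=lp_taps)
print(np.nonzero(exc)[0][:5], exc[12], exc[69])
```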

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .
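
Claim 5 of the '710 patent above scales the synthesised signal so that its energy at the beginning of the first good frame matches the energy at the end of the last concealed frame, then converges it towards the received energy information while limiting any increase. The sketch below illustrates one way to realise such a two-point gain control; the 16-sample energy windows, the linear gain interpolation and the 2.0 gain cap are assumptions, not the patent's method.

```python
# Sketch of the energy-control step of '710 claim 5: match the start-of-frame energy to the
# end of the concealed signal, then interpolate the gain towards the value implied by the
# received energy information while limiting the increase.  Window sizes, the linear
# interpolation and the gain cap are assumptions for illustration only.
import numpy as np

def rescale_first_good_frame(synth, e_end_concealed, e_received, max_gain=2.0):
    x = np.asarray(synth, dtype=float)
    e_begin = np.mean(x[:16] ** 2) + 1e-12           # energy at the frame beginning
    e_end = np.mean(x[-16:] ** 2) + 1e-12            # energy towards the frame end
    g0 = np.sqrt(e_end_concealed / e_begin)          # match the end of the concealed signal
    g1 = min(np.sqrt(e_received / e_end), max_gain)  # converge, limiting the increase
    gains = np.linspace(g0, g1, len(x))              # sample-by-sample interpolation (assumed)
    return x * gains

frame = np.sin(2 * np.pi * 0.02 * np.arange(160))
out = rescale_first_good_frame(frame, e_end_concealed=0.1, e_received=0.5)
print(round(np.mean(out[:16] ** 2), 3), round(np.mean(out[-16:] ** 2), 3))
```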

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (decoding apparatus) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (signal parameters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US5749065A
CLAIM 39
. The speech decoding apparatus as claimed in claim 38 , wherein the computation means outputs a linear predictive coding (LPC) residue as the short-term prediction residue , and wherein the synthesizing means employs as the encoded speech signal parameters (LP filter) respectively representing LPC coefficients , pitch information representing a basic period of the LPC residue , index information from vector quantization or matrix quantization of a spectral envelope of the LPC residue and information indicating whether the input speech signal is voice or unvoiced .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (signal parameters) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5749065A
CLAIM 39
. The speech decoding apparatus as claimed in claim 38 , wherein the computation means outputs a linear predictive coding (LPC) residue as the short-term prediction residue , and wherein the synthesizing means employs as the encoded speech signal parameters (LP filter) respectively representing LPC coefficients , pitch information representing a basic period of the LPC residue , index information from vector quantization or matrix quantization of a spectral envelope of the LPC residue and information indicating whether the input speech signal is voice or unvoiced .
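
Claims 8 and 9 of the '710 patent above apply the adjustment only when the LP filter gain of the first good frame exceeds that of the last concealed frame, rescaling the decoded excitation to the target energy E_q = E_1·√(E_LP0/E_LP1). The sketch below shows that conditional rescaling; measuring the LP gains by impulse-response energies passed in as plain numbers, and the toy figures used, are assumptions for illustration.

```python
# Sketch of the conditional excitation-energy adjustment of '710 claims 8-9: only when the
# LP filter gain of the first good frame exceeds that of the last concealed frame is the
# decoded excitation rescaled to E_q = E_1 * sqrt(E_LP0 / E_LP1).  The toy numbers and the
# use of impulse-response energies as the gain measure are assumptions for illustration.
import numpy as np

def adjust_excitation(excitation, E_1, E_LP0, E_LP1):
    exc = np.asarray(excitation, dtype=float)
    if E_LP1 <= E_LP0:                 # first good frame does not have the higher LP gain
        return exc
    E_q = E_1 * np.sqrt(E_LP0 / E_LP1)
    e_exc = np.sum(exc ** 2) + 1e-12
    return exc * np.sqrt(E_q / e_exc)  # scale the excitation energy to E_q

exc = np.ones(64) * 0.3
print(round(np.sum(adjust_excitation(exc, E_1=4.0, E_LP0=1.5, E_LP1=3.0) ** 2), 3))
```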

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (signal parameters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US5749065A
CLAIM 39
. The speech decoding apparatus as claimed in claim 38 , wherein the computation means outputs a linear predictive coding (LPC) residue as the short-term prediction residue , and wherein the synthesizing means employs as the encoded speech signal parameters (LP filter) respectively representing LPC coefficients , pitch information representing a basic period of the LPC residue , index information from vector quantization or matrix quantization of a spectral envelope of the LPC residue and information indicating whether the input speech signal is voice or unvoiced .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs (decoding apparatus) , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (fixed number) affected by the artificial construction of the periodic part .
US5749065A
CLAIM 13
. A method for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal which is divided on a time axis on a block basis , the short-term prediction residue being represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the method for decoding comprising the steps of : finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number (last subframe) of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing a time-axis waveform signal based on the short-term prediction residual waveform of the encoded speech signal .

US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (decoding apparatus) , limits to a given value a gain used for scaling the synthesized sound signal .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (decoding apparatus) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (signal parameters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US5749065A
CLAIM 39
. The speech decoding apparatus as claimed in claim 38 , wherein the computation means outputs a linear predictive coding (LPC) residue as the short-term prediction residue , and wherein the synthesizing means employs as the encoded speech signal parameters (LP filter) respectively representing LPC coefficients , pitch information representing a basic period of the LPC residue , index information from vector quantization or matrix quantization of a spectral envelope of the LPC residue and information indicating whether the input speech signal is voice or unvoiced .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (signal parameters) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5749065A
CLAIM 39
. The speech decoding apparatus as claimed in claim 38 , wherein the computation means outputs a linear predictive coding (LPC) residue as the short-term prediction residue , and wherein the synthesizing means employs as the encoded speech signal parameters (LP filter) respectively representing LPC coefficients , pitch information representing a basic period of the LPC residue , index information from vector quantization or matrix quantization of a spectral envelope of the LPC residue and information indicating whether the input speech signal is voice or unvoiced .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery (decoding apparatus) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (signal parameters) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5749065A
CLAIM 38
. A speech decoding apparatus (decoder recovery, decoder constructs) for decoding an encoded speech signal formed using a short-term prediction residue of an input speech signal divided on a time axis on a block basis , the short-term prediction residue represented by a sum of sine waves on the block basis , wherein information of a frequency spectrum of the sum of the sine waves is encoded to form the encoded speech signal to be decoded , the decoding apparatus comprising : computation means for finding a short-term prediction residual waveform by sine wave synthesis of the encoded speech signal by converting a fixed number of data of the frequency spectrum into a variable number thereof , wherein the encoded speech signal is encoded by matrix quantization or vector quantization with weighting that takes into account factors relating to human hearing sense ;
and synthesizing means for synthesizing a time-axis waveform signal based on the short-term residual waveform .

US5749065A
CLAIM 39
. The speech decoding apparatus as claimed in claim 38 , wherein the computation means outputs a linear predictive coding (LPC) residue as the short-term prediction residue , and wherein the synthesizing means employs as the encoded speech signal parameters (LP filter) respectively representing LPC coefficients , pitch information representing a basic period of the LPC residue , index information from vector quantization or matrix quantization of a spectral envelope of the LPC residue and information indicating whether the input speech signal is voice or unvoiced .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5724433A

Filed: 1995-06-07     Issued: 1998-03-03

Adaptive gain and filtering circuit for a sound reproduction system

(Original Assignee) K/S Himpp     (Current Assignee) HIMPP K/S ; K/S Himpp

A. Maynard Engebretson, Michael P. O'Connell
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (filtered signal) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5724433A
CLAIM 26
. An adaptive gain amplifier circuit comprising a plurality of channels connected to a common output , each channel comprising : a filter with preset parameters for receiving an input signal in the audible frequency range for producing a filtered signal (first impulse) ;
a channel amplifier responsive to the filtered signal for producing a channel output signal ;
a channel gain register for storing a gain value ;
a channel preamplifier having a preset gain for amplifying the gain value to produce a gain signal ;
wherein the channel amplifier is responsive to the channel preamplifier for varying the gain of the channel amplifier as a function of the gain signal ;
means for establishing a channel threshold level for the channel output signal ;
and means , responsive to the channel output signal and the channel threshold level , for increasing the gain value up to a predetermined limit when the channel output signal falls below the channel threshold level and for decreasing the gain value when the channel output signal rises above the channel threshold level ;
wherein the channel output signals are combined to produce an adaptively compressed and filtered output signal .
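As a reading aid for the excitation-construction limitation of claim 1 above, here is a minimal Python sketch, assuming a windowed-sinc low-pass prototype and a single frame-length buffer; the tap count, cutoff and frame length are hypothetical values, not the codec's actual filter.

import numpy as np

def lowpass_impulse_response(n_taps=31, cutoff=0.25):
    # Windowed-sinc low-pass prototype (Hamming window); cutoff in cycles/sample (0 .. 0.5).
    m = np.arange(n_taps) - (n_taps - 1) / 2.0
    h = 2.0 * cutoff * np.sinc(2.0 * cutoff * m)
    return h * np.hamming(n_taps)

def build_periodic_excitation(frame_len, q_pulse_pos, avg_pitch, n_taps=31):
    # The first impulse response is centred on the quantized position of the first
    # glottal pulse; further copies are placed every average pitch value up to the
    # end of the buffer being reconstructed.
    h = lowpass_impulse_response(n_taps)
    exc = np.zeros(frame_len)
    pos = int(q_pulse_pos)
    while pos < frame_len:
        start = pos - n_taps // 2            # centre the impulse response on the pulse position
        for i in range(n_taps):
            idx = start + i
            if 0 <= idx < frame_len:
                exc[idx] += h[i]
        pos += int(round(avg_pitch))         # next pulse one average pitch value later
    return exc

exc = build_periodic_excitation(frame_len=256, q_pulse_pos=37, avg_pitch=80.0)
print(len(exc), round(float(exc.max()), 3))

The loop simply adds one shifted copy of the filter's impulse response per pitch period, which is one straightforward way to realize a "low-pass filtered periodic train of pulses" of the kind the claim recites.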

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5724433A
CLAIM 1
. An adaptive gain amplifier circuit comprising : an amplifier for receiving an input signal in the audible frequency range and producing an output signal ;
means for establishing a threshold level for the output signal ;
a comparator for producing a control signal (maximum amplitude) as a function of the level of the output signal being greater or less than the threshold level ;
a gain register for storing a gain setting ;
an adder responsive to the control signal for increasing the gain setting up to a predetermined limit when the output signal falls below the threshold level and for decreasing the gain setting when the output signal rises above the threshold level ;
and a preamplifier having a preset gain for amplifying the gain setting to produce a gain signal ;
wherein the amplifier is responsive to the preamplifier for varying the gain of the amplifier as a function of the gain signal , wherein the output signal is adaptively compressed .
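For the phase-information limitation of claim 3 above (the first glottal pulse taken as the sample of maximum amplitude within a pitch period, then position-quantized), a minimal Python sketch follows; the uniform quantization step and the synthetic residual are assumptions for illustration only.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=2):
    # Take the sample of maximum amplitude within the first pitch period of the
    # (assumed) LP residual as the first glottal pulse, then quantize its position
    # with a uniform step; the actual codec's quantizer may differ.
    search = np.abs(residual[:pitch_period])
    pos = int(np.argmax(search))                  # sample of maximum amplitude
    q_index = pos // step                         # uniform quantization of the position
    q_pos = q_index * step
    return pos, q_index, q_pos

rng = np.random.default_rng(0)
res = rng.normal(scale=0.1, size=160)
res[53] = 1.0                                     # synthetic "glottal pulse"
print(first_glottal_pulse_position(res, pitch_period=80))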

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5724433A
CLAIM 1
. An adaptive gain amplifier circuit comprising : an amplifier for receiving an input signal in the audible frequency range and producing an output signal ;
means for establishing a threshold level for the output signal ;
a comparator for producing a control signal (maximum amplitude) as a function of the level of the output signal being greater or less than the threshold level ;
a gain register for storing a gain setting ;
an adder responsive to the control signal for increasing the gain setting up to a predetermined limit when the output signal falls below the threshold level and for decreasing the gain setting when the output signal rises above the threshold level ;
and a preamplifier having a preset gain for amplifying the gain setting to produce a gain signal ;
wherein the amplifier is responsive to the preamplifier for varying the gain of the amplifier as a function of the gain signal , wherein the output signal is adaptively compressed .
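For contrast, the reference's own mechanism (US5724433A claim 1) is a threshold-driven gain update rather than an erasure-concealment parameter; a minimal sketch of that update, with an assumed unit step and an arbitrary gain ceiling, is:

def adaptive_gain_step(gain, output_level, threshold, step=1, gain_limit=64):
    # Increase the stored gain (up to a predetermined limit) when the output falls
    # below the threshold, decrease it when the output rises above the threshold.
    # Step size and limit are illustrative, not taken from the reference.
    if output_level < threshold:
        return min(gain + step, gain_limit)
    if output_level > threshold:
        return max(gain - step, 0)
    return gain

g = 10
for level in (0.2, 0.2, 0.9, 0.9, 0.5):
    g = adaptive_gain_step(g, level, threshold=0.5)
    print(g)

Each call roughly corresponds to one comparator decision feeding the adder and gain register recited in the reference claim.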

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (filtered signal) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5724433A
CLAIM 26
. An adaptive gain amplifier circuit comprising a plurality of channels connected to a common output , each channel comprising : a filter with preset parameters for receiving an input signal in the audible frequency range for producing a filtered signal (first impulse) ;
a channel amplifier responsive to the filtered signal for producing a channel output signal ;
a channel gain register for storing a gain value ;
a channel preamplifier having a preset gain for amplifying the gain value to produce a gain signal ;
wherein the channel amplifier is responsive to the channel preamplifier for varying the gain of the channel amplifier as a function of the gain signal ;
means for establishing a channel threshold level for the channel output signal ;
and means , responsive to the channel output signal and the channel threshold level , for increasing the gain value up to a predetermined limit when the channel output signal falls below the channel threshold level and for decreasing the gain value when the channel output signal rises above the channel threshold level ;
wherein the channel output signals are combined to produce an adaptively compressed and filtered output signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5724433A
CLAIM 1
. An adaptive gain amplifier circuit comprising : an amplifier for receiving an input signal in the audible frequency range and producing an output signal ;
means for establishing a threshold level for the output signal ;
a comparator for producing a control signal (maximum amplitude) as a function of the level of the output signal being greater or less than the threshold level ;
a gain register for storing a gain setting ;
an adder responsive to the control signal for increasing the gain setting up to a predetermined limit when the output signal falls below the threshold level and for decreasing the gain setting when the output signal rises above the threshold level ;
and a preamplifier having a preset gain for amplifying the gain setting to produce a gain signal ;
wherein the amplifier is responsive to the preamplifier for varying the gain of the amplifier as a function of the gain signal , wherein the output signal is adaptively compressed .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5724433A
CLAIM 1
. An adaptive gain amplifier circuit comprising : an amplifier for receiving an input signal in the audible frequency range and producing an output signal ;
means for establishing a threshold level for the output signal ;
a comparator for producing a control signal (maximum amplitude) as a function of the level of the output signal being greater or less than the threshold level ;
a gain register for storing a gain setting ;
an adder responsive to the control signal for increasing the gain setting up to a predetermined limit when the output signal falls below the threshold level and for decreasing the gain setting when the output signal rises above the threshold level ;
and a preamplifier having a preset gain for amplifying the gain setting to produce a gain signal ;
wherein the amplifier is responsive to the preamplifier for varying the gain of the amplifier as a function of the gain signal , wherein the output signal is adaptively compressed .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5668925A

Filed: 1995-06-01     Issued: 1997-09-16

Low data rate speech encoder with mixed excitation

(Original Assignee) Martin Marietta Corp     (Current Assignee) RETRO REFLECTIVE OPTICS

Joseph Harvey Rothweiler, John Charles Carmody, Srinivas Nandkumar
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (said signals) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse (generating codewords) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (average pitch value) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5668925A
CLAIM 4
. A method for coding digital temporally related speech signals including spectra , comprising the steps of : providing memorized monotonic spectrum values identified by codewords ;
dividing said speech signals into nonoverlapping blocks , each of which includes said spectra ;
taking the differences between a lower set of said spectra in one of said blocks and the remaining signals in said block , to generate difference signals ;
comparing said difference signals and said one signal in each of said blocks with said memorized values ;
in response to said comparisons , assigning to each of said difference signals a codeword representing that one of said memorized signals which is the closest match to that one of said difference signals ;
in response to said comparisons , assigning to said one of said signals (onset frame) in each of said blocks a codeword representing that one of said memorized signals which is the closest match to said one of said signals ;
generating a combination codeword for each of said blocks by product coding .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .
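A minimal Python sketch of the energy-information limitation of claim 4 above (maximum of the signal energy for frames classified as voiced or onset, average energy per sample otherwise), assuming a dB representation and ignoring the patent's exact windowing and quantization details:

import numpy as np

def energy_information(frame, frame_class):
    # Maximum of the signal energy for voiced/onset frames, average energy per
    # sample for other frames; expressed in dB for illustration.
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        e = float(np.max(x ** 2))          # maximum of the signal energy
    else:
        e = float(np.mean(x ** 2))         # average energy per sample
    return 10.0 * np.log10(e + 1e-12)

rng = np.random.default_rng(1)
frame = rng.normal(size=256)
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))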

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .
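By way of comparison, the receiver-side step of US5668925A claim 3 (randomly varying the pitch intervals when the transmitted one-bit jitter flag is set) can be sketched as follows; the +/-10 % jitter range and the frame-wise flag list are illustrative assumptions, not figures from the reference.

import numpy as np

def decode_pitch_intervals(base_pitch, n_frames, jitter_flags, max_jitter=0.1, seed=0):
    # When the one-bit jitter flag is set for a frame, vary the pitch interval
    # randomly; otherwise leave it unchanged.
    rng = np.random.default_rng(seed)
    pitches = []
    for k in range(n_frames):
        p = float(base_pitch)
        if jitter_flags[k]:
            p *= 1.0 + rng.uniform(-max_jitter, max_jitter)
        pitches.append(p)
    return pitches

print(decode_pitch_intervals(80.0, 4, [0, 1, 1, 0]))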

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .
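The two transition conditions recited in claim 7 above can be captured in a short predicate; the class and coding labels used below are illustrative strings, not the codec's actual signalling.

def use_end_gain_at_frame_start(last_good_class, first_good_class,
                                last_good_coding, first_good_coding):
    # Force the start-of-frame scaling gain equal to the end-of-frame gain
    # (i) on a voiced-to-unvoiced transition and (ii) on a comfort-noise-to-
    # active-speech transition, per the claim element above.
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = (last_good_class in voiced_like and first_good_class == "unvoiced")
    cn_to_active = (last_good_coding == "comfort noise" and first_good_coding == "active speech")
    return voiced_to_unvoiced or cn_to_active

print(use_end_gain_at_frame_start("voiced", "unvoiced", "active speech", "active speech"))
print(use_end_gain_at_frame_start("unvoiced", "unvoiced", "comfort noise", "active speech"))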

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (said signals) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (generating codewords) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (average pitch value) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5668925A
CLAIM 4
. A method for coding digital temporally related speech signals including spectra , comprising the steps of : providing memorized monotonic spectrum values identified by codewords ;
dividing said speech signals into nonoverlapping blocks , each of which includes said spectra ;
taking the differences between a lower set of said spectra in one of said blocks and the remaining signals in said block , to generate difference signals ;
comparing said difference signals and said one signal in each of said blocks with said memorized values ;
in response to said comparisons , assigning to each of said difference signals a codeword representing that one of said memorized signals which is the closest match to that one of said difference signals ;
in response to said comparisons , assigning to said one of said signals (onset frame) in each of said blocks a codeword representing that one of said memorized signals which is the closest match to said one of said signals ;
generating a combination codeword for each of said blocks by product coding .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5668925A
CLAIM 3
. A method for transmitting information in the form of speech signal (speech signal, decoder determines concealment) s over a limited-data-rate data path , comprising the steps of : separating those portions of input speech signals containing jitter from those portions which do not contain jitter , to thereby produce (a) jittering speech signals containing varying pitch intervals , and (b) non-jittering speech signals ;
determining , on a frame-by-frame basis , the variation in the pitch intervals in said jittering speech signals ;
comparing said variation with a threshold ;
generating a particular state of a one-bit jitter signal when said variation exceeds said threshold , and generating the other state otherwise ;
transmitting said one-bit jitter signal over said data path to produce a transmitted jitter signal ;
generating a pitch signal , defining pitch intervals , at the receiving end of said data path ;
and when said transmitted jitter signal is in said particular state , randomly varying said pitch intervals of said pitch signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5781880A

Filed: 1995-05-30     Issued: 1998-07-14

Pitch lag estimation using frequency-domain lowpass filtering of the linear predictive coding (LPC) residual

(Original Assignee) Rockwell International Corp     (Current Assignee) ROCKWELLSCIENCE CENTER Inc ; WIAV Solutions LLC

Huan-Yu Su
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (output means) affected by the artificial construction of the periodic part .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (point D) per sample for other frames .
US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point D (average energy) FT and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .
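A minimal Python sketch of the pitch-lag estimator recited in US5781880A claim 9, assuming an 8 kHz sampling rate, a 1 kHz retained band and a common 256-point length for both DFTs so that the quasi-time index maps directly to a lag in samples; the smoothing step below is a sliding two-sample average rather than the decimating weighted average of the claim, and all values are illustrative.

import numpy as np

def estimate_pitch_lag(residual, fs=8000.0, keep_hz=1000.0, min_lag=20, max_lag=147):
    # DFT of a segment of the LPC residual, low-pass filtering of the amplitude
    # spectrum in the frequency domain, a second DFT taken directly over the
    # amplitude spectrum (no logarithm), light smoothing, then a peak search.
    n = 256
    seg = np.asarray(residual[:n], dtype=float)
    spec = np.abs(np.fft.fft(seg))                       # amplitude spectrum of the first DFT
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    spec[np.abs(freqs) > keep_hz] = 0.0                  # drop high-frequency components
    quasi = np.real(np.fft.fft(spec))                    # second DFT; real-valued quasi-time samples
    smoothed = np.copy(quasi)
    smoothed[1:] = 0.5 * (quasi[1:] + quasi[:-1])        # mild averaging (the reference decimates)
    hi = min(max_lag, n // 2)
    return min_lag + int(np.argmax(smoothed[min_lag:hi + 1]))

# Synthetic LPC-residual-like pulse train with a 60-sample period plus noise:
rng = np.random.default_rng(2)
res = np.zeros(400)
res[::60] = 1.0
res += 0.05 * rng.normal(size=res.size)
print(estimate_pitch_lag(res))   # expected to come out close to 60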

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (output means) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .
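The energy-control limitation of claim 5 above (match the start-of-frame energy to the end of the last erased frame, then converge to the transmitted energy information while limiting any increase) can be sketched as a per-sample gain ramp; the segment length used to measure energy, the 2x increase cap and the linear interpolation are assumptions for illustration.

import numpy as np

def scale_first_good_frame(synth, e_start_target, e_end_target, max_ratio=2.0):
    # Scale the first good frame after an erasure so its start energy matches the
    # end of the last concealed frame, interpolating the gain across the frame so
    # the energy converges toward the transmitted energy information by frame end,
    # with the allowed increase limited.
    x = np.asarray(synth, dtype=float)
    n = len(x)
    seg = max(n // 8, 1)                               # short segments used to measure energy
    e_start = np.mean(x[:seg] ** 2) + 1e-12
    e_end = np.mean(x[-seg:] ** 2) + 1e-12
    g0 = np.sqrt(e_start_target / e_start)             # match the end of the last erased frame
    g1 = np.sqrt(e_end_target / e_end)                 # converge to the received energy info
    g1 = min(g1, max_ratio * g0)                       # limit the increase in energy
    gains = np.linspace(g0, g1, n)                     # sample-by-sample interpolation
    return x * gains

rng = np.random.default_rng(3)
frame = rng.normal(scale=0.5, size=256)
out = scale_first_good_frame(frame, e_start_target=0.1, e_end_target=0.4)
print(float(np.mean(out[:32] ** 2)), float(np.mean(out[-32:] ** 2)))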

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (weighted average) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point DFT and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average (E q) to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (weighted average) = E_1 · E_LP0 / E_LP1 , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point DFT and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average (E q) to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (output means) affected by the artificial construction of the periodic part .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (point D) per sample for other frames .
US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point D (average energy) FT and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (output means) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (weighted average) = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
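
The relation recited in claim 21 ties the target excitation energy in the first good frame to the energy measured at the end of the current (concealed) frame and to the ratio of LP-filter impulse-response energies before and after the erasure. The sketch below restates that relation together with a helper that measures the impulse-response energy of an all-pole synthesis filter 1/A(z); the 64-sample truncation of the impulse response is an illustrative assumption.

import numpy as np

def lp_impulse_response_energy(a_coeffs, length=64):
    """Energy of the (truncated) impulse response of an all-pole filter 1/A(z),
    with A(z) = 1 + a1*z^-1 + ... + ap*z^-p and a_coeffs = [a1, ..., ap]."""
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        # Direct-form recursion: h[n] = delta[n] - sum_k a_k * h[n-k]
        for k, a in enumerate(a_coeffs, start=1):
            if n - k >= 0:
                acc -= a * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def excitation_energy_target(e1, e_lp0, e_lp1):
    """E_q = E_1 * E_LP0 / E_LP1, as recited in the relation above."""
    return e1 * e_lp0 / e_lp1

In this reading, E_1 would be measured on the synthesized signal at the end of the concealed frame, while E_LP0 and E_LP1 come from the helper applied to the LP coefficients of the last good frame before and the first good frame after the erasure.
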
US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point DFT and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average (E q) to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (point D) per sample for other frames .
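
Claim 24 computes the energy information parameter differently depending on the frame class: from the maximum of the signal energy for voiced or onset frames, and from the average energy per sample otherwise. A minimal sketch of that distinction is given below; the frame-class labels are taken as given and any quantization of the resulting value is omitted.

import numpy as np

def energy_information(frame, frame_class):
    """Energy information parameter as read from claim 24 (quantization omitted)."""
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))      # maximum of the signal energy
    return float(np.mean(frame ** 2))         # average energy per sample
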
US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point DFT (average energy) and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (weighted average) = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5781880A
CLAIM 8
. The system of claim 1 , further comprising : speech input means for receiving the input speech ;
means for determining the LPC residual signal of the input speech ;
a computer for processing the initial pitch lag value to reproduce the LPC residual signal as coded speech ;
and speech output means (last subframe, last frame) for outputting the coded speech .

US5781880A
CLAIM 9
. A system operable with a computer for estimating pitch lag for input speech quantization and compression requiring substantially reduced complexity on the order of three times less complexity than standard pitch detection methods , the speech having a linear predictive coding (LPC) residual signal defined by a plurality of LPC residual samples , wherein the estimated pitch lag falls within a predetermined minimum and maximum pitch lag value range , further wherein the speech represents voiced and unvoiced speech within a typical frequency range having a fundamental frequency , the system comprising : means for selecting a pitch analysis window among the LPC residual samples , the pitch analysis window being at least twice as large as the maximum pitch lag value ;
means for applying a first discrete Fourier transform (DFT) to the windowed plurality of LPC residual samples , the first DFT having an associated amplitude spectrum , the amplitude spectrum having low and high frequency components ;
a filter for filtering out the high frequency components of the amplitude spectrum in the frequency domain , thereby providing for substantially reduced system complexity , wherein frequencies between zero and at least two times the typical frequency range of the speech are retained to ensure that at least one harmonic is detected to prevent confusion in detecting the fundamental frequency ;
means for applying a second DFT directly over the amplitude spectrum of the first DFT without taking the logarithm of the squared amplitude , the second DFT being a 256-point DFT and having associated quasi-time domain-transformed samples such that the quasi-time domain-transformed samples are real values ;
means for applying a weighted average (E q) to the time domain-transformed samples , wherein at least two samples are combined to produce a single sample ;
means for searching the time-domain transformed speech samples to find at least one sample having a maximum peak value ;
and means for estimating an initial pitch lag value according to the sample having the maximum peak value .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
JPH08263098A

Filed: 1995-03-28     Issued: 1996-10-11

Acoustic signal encoding method and acoustic signal decoding method (音響信号符号化方法、音響信号復号化方法)

(Original Assignee) Nippon Telegr & Teleph Corp <Ntt> (Nippon Telegraph and Telephone Corporation)

Kazunaga Ikeda, Naoki Iwagami, Satoshi Miki, Takehiro Moriya
US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
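
Claims 3 and 11 derive the phase information parameter by taking the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position. The sketch below follows that reading on the LP residual; the uniform 4-sample quantization step is an illustrative assumption, not a value recited in the claims.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Position of the first glottal pulse within a pitch period (sketch)."""
    segment = np.abs(residual[:pitch_period])
    position = int(np.argmax(segment))          # sample of maximum amplitude
    quantized = step * round(position / step)   # quantized position (illustrative step)
    return min(quantized, pitch_period - 1)
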
JPH08263098A
CLAIM 4
[Claim 4] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: a first encoding step of encoding the input acoustic signal, for each frame, by a first encoding method that vector-quantizes the signal in the time domain; a second encoding step of encoding the input acoustic signal, for each frame, by a second encoding method that vector-quantizes the signal in the frequency domain; and a step of selecting, between the code produced by the first encoding step and the code produced by the second encoding step, the one with the smaller coding distortion, and outputting the selected signal code together with an encoding-method code indicating the encoding method used; the method being characterized by comprising (有すること) (maximum amplitude) these steps.
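
JPH08263098A claim 4 describes a closed-loop selection: encode each frame with both a time-domain and a frequency-domain vector quantizer and keep whichever yields the smaller coding distortion, together with a code identifying the chosen method. A minimal sketch of that selection logic follows; encode_time and encode_freq are hypothetical callables standing in for the two encoders and are assumed to return a (code, reconstruction) pair.

import numpy as np

def select_coding_method(frame, encode_time, encode_freq):
    """Closed-loop method selection per the claim 4 reading (sketch)."""
    code_t, rec_t = encode_time(frame)
    code_f, rec_f = encode_freq(frame)
    dist_t = np.sum((frame - rec_t) ** 2)       # coding distortion of method 1
    dist_f = np.sum((frame - rec_f) ** 2)       # coding distortion of method 2
    if dist_t <= dist_f:
        return code_t, "time-domain"            # plus a code indicating method 1
    return code_f, "frequency-domain"           # plus a code indicating method 2
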

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy (各フレームごと) for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
JPH08263098A
CLAIM 1
[Claim 1] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: an encoding-method decision step of analyzing the input acoustic signal for each frame (各フレームごと) (signal energy) to determine which of a first encoding method and a second encoding method is suitable; a first encoding step of, when the first encoding method is determined to be suitable in the encoding-method decision step, vector-quantizing the input acoustic signal in the time domain and outputting a signal code and an encoding-method code indicating selection of the first encoding method; and a second encoding step of, when the second encoding method is determined to be suitable in the encoding-method decision step, vector-quantizing the input acoustic signal in the frequency domain and outputting a signal code and an encoding-method code indicating selection of the second encoding method.

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
JPH08263098A
CLAIM 4
[Claim 4] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: a first encoding step of encoding the input acoustic signal, for each frame, by a first encoding method that vector-quantizes the signal in the time domain; a second encoding step of encoding the input acoustic signal, for each frame, by a second encoding method that vector-quantizes the signal in the frequency domain; and a step of selecting, between the code produced by the first encoding step and the code produced by the second encoding step, the one with the smaller coding distortion, and outputting the selected signal code together with an encoding-method code indicating the encoding method used; the method being characterized by comprising (有すること) (maximum amplitude) these steps.

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH08263098A
CLAIM 4
[Claim 4] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: a first encoding step of encoding the input acoustic signal, for each frame, by a first encoding method that vector-quantizes the signal in the time domain; a second encoding step of encoding the input acoustic signal, for each frame, by a second encoding method that vector-quantizes the signal in the frequency domain; and a step of selecting, between the code produced by the first encoding step and the code produced by the second encoding step, the one with the smaller coding distortion, and outputting the selected signal code together with an encoding-method code indicating the encoding method used; the method being characterized by comprising (有すること) (maximum amplitude) these steps.

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (各フレームごと) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH08263098A
CLAIM 1
[Claim 1] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: an encoding-method decision step of analyzing the input acoustic signal for each frame (各フレームごと) (signal energy) to determine which of a first encoding method and a second encoding method is suitable; a first encoding step of, when the first encoding method is determined to be suitable in the encoding-method decision step, vector-quantizing the input acoustic signal in the time domain and outputting a signal code and an encoding-method code indicating selection of the first encoding method; and a second encoding step of, when the second encoding method is determined to be suitable in the encoding-method decision step, vector-quantizing the input acoustic signal in the frequency domain and outputting a signal code and an encoding-method code indicating selection of the second encoding method.

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (有すること) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
JPH08263098A
CLAIM 4
[Claim 4] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: a first encoding step of encoding the input acoustic signal, for each frame, by a first encoding method that vector-quantizes the signal in the time domain; a second encoding step of encoding the input acoustic signal, for each frame, by a second encoding method that vector-quantizes the signal in the frequency domain; and a step of selecting, between the code produced by the first encoding step and the code produced by the second encoding step, the one with the smaller coding distortion, and outputting the selected signal code together with an encoding-method code indicating the encoding method used; the method being characterized by comprising (有すること) (maximum amplitude) these steps.

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy (各フレームごと) for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
JPH08263098A
CLAIM 1
[Claim 1] An acoustic signal encoding method in which an input acoustic signal is divided into frames at fixed time intervals and each frame is encoded, the method comprising: an encoding-method decision step of analyzing the input acoustic signal for each frame (各フレームごと) (signal energy) to determine which of a first encoding method and a second encoding method is suitable; a first encoding step of, when the first encoding method is determined to be suitable in the encoding-method decision step, vector-quantizing the input acoustic signal in the time domain and outputting a signal code and an encoding-method code indicating selection of the first encoding method; and a second encoding step of, when the second encoding method is determined to be suitable in the encoding-method decision step, vector-quantizing the input acoustic signal in the frequency domain and outputting a signal code and an encoding-method code indicating selection of the second encoding method.




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5699478A

Filed: 1995-03-10     Issued: 1997-12-16

Frame erasure compensation technique

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Dror Nahumi
US7693710B2
CLAIM 1
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
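
When an onset frame is lost, claim 1 constructs the periodic excitation artificially as a low-pass filtered train of pulses: the first low-pass impulse response is centered on the quantized position of the first glottal pulse, and the remaining responses are placed one average pitch period apart up to the end of the last affected subframe. A minimal NumPy sketch of that construction is given below; the windowed-sinc low-pass response, its length and cutoff are illustrative assumptions, and region_len stands for the length of the region affected by the artificial construction.

import numpy as np

def lowpass_ir(cutoff=0.25, taps=31):
    """Illustrative odd-length, symmetric low-pass impulse response (windowed sinc)."""
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(2 * cutoff * n) * np.hamming(taps)
    return h / np.sum(h)

def build_periodic_excitation(region_len, first_pulse_pos, avg_pitch, lp_ir):
    """Artificial periodic excitation part as a low-pass filtered pulse train (sketch)."""
    excitation = np.zeros(region_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos                        # quantized first glottal pulse position
    while pos < region_len:
        lo, hi = max(0, pos - half), min(region_len, pos + half + 1)
        # Center one impulse response of the low-pass filter at the current pulse position.
        excitation[lo:hi] += lp_ir[(lo - pos) + half : (hi - pos) + half]
        pos += avg_pitch                         # next pulse one average pitch value later
    return excitation
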
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
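
Claim 2 encodes the shape, sign and amplitude of the first glottal pulse and transmits them as phase information. The sketch below illustrates one plausible encoder-side reading; the small unit-energy shape codebook, correlation-based shape selection and the 4-bit scalar amplitude quantizer are illustrative assumptions and are not recited in either the '710 claims or the '478 reference.

import numpy as np

def encode_first_glottal_pulse(residual, position, shapes, amp_step=0.1):
    """Encode shape, sign and amplitude of the first glottal pulse (sketch).

    shapes -- assumed codebook of unit-energy pulse shapes, one per row
    """
    length = shapes.shape[1]
    segment = np.asarray(residual[position : position + length], dtype=float)
    if len(segment) < length:
        segment = np.pad(segment, (0, length - len(segment)))
    peak = int(np.argmax(np.abs(segment)))
    sign = 1 if segment[peak] >= 0 else -1                   # sign of the pulse
    amplitude = abs(float(segment[peak]))                    # amplitude of the pulse
    shape_idx = int(np.argmax(shapes @ (sign * segment)))    # best-matching shape index
    amp_idx = min(15, int(round(amplitude / amp_step)))      # crude 4-bit amplitude index
    return shape_idx, sign, amp_idx
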
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .
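
The '478 compensation works from a per-frame delta parameter: each frame carries the change in a speech parameter relative to the frame a predetermined number of frames earlier, and after an erasure the parameter memory (or code table) is repaired from the delta carried by the frame that succeeds the erased frame by that same offset. A minimal sketch of that update rule follows; the list-based memory and the lookahead default of one frame are illustrative assumptions.

def update_on_erasure(history, delta_history, erased_index, lookahead=1):
    """Repair the parameter memory after a frame erasure (sketch of the '478 reading).

    history       -- received parameter values per frame (None where erased)
    delta_history -- received delta parameters, aligned with history
    """
    future = erased_index + lookahead
    if future < len(history) and history[future] is not None:
        # delta(future) = parameter(future) - parameter(future - lookahead),
        # so the erased entry can be recovered as parameter(future) - delta(future).
        history[erased_index] = history[future] - delta_history[future]
    return history
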

US7693710B2
CLAIM 3
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (frame erasure) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
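
Claim 7 freezes the recovery gain in two situations that the erasure may have hidden: a voiced-to-unvoiced transition and a comfort-noise-to-active-speech transition; in both cases the gain at the beginning of the first good frame is simply set equal to the gain used at its end. A minimal sketch of that rule is shown below; the string labels for frame classes and coding modes are illustrative placeholders.

def recovery_gains(last_good_class, first_good_class,
                   last_good_coding, first_good_coding, g_begin, g_end):
    """Gain selection for the first good frame after an erasure (sketch of claim 7)."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    hides_voiced_to_unvoiced = (last_good_class in voiced_like
                                and first_good_class == "unvoiced")
    hides_cn_to_active = (last_good_coding == "comfort noise"
                          and first_good_coding == "active speech")
    if hides_voiced_to_unvoiced or hides_cn_to_active:
        return g_end, g_end        # beginning gain made equal to the end gain
    return g_begin, g_end
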
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number (last non) of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) , and a phase information parameter (coded representation) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number (last non) of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number (last non) of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representation (energy information parameter, phase information parameter) s of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .
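
The artificial periodic excitation recited in claim 13 of US7693710B2 above (and in method claim 1) lends itself to a short worked example. The Python/NumPy sketch below centers a first low-pass filter impulse response on the quantized first-glottal-pulse position and places the remaining responses one average pitch period apart; the filter taps, frame length and pitch value are assumptions for illustration, not values from the patent.

    import numpy as np

    def build_periodic_excitation(first_pulse_pos, avg_pitch, length, lp_h):
        """Low-pass filtered periodic pulse train (lp_h must have odd length)."""
        excitation = np.zeros(length)
        half = len(lp_h) // 2
        pos = first_pulse_pos                       # quantized position of the first glottal pulse
        while pos < length:
            start = max(0, pos - half)              # clip the impulse response at the frame edges
            stop = min(length, pos + half + 1)
            excitation[start:stop] += lp_h[start - (pos - half): stop - (pos - half)]
            pos += avg_pitch                        # next pulse one average pitch period later
        return excitation

    lp_h = np.array([0.1, 0.25, 0.3, 0.25, 0.1])    # assumed low-pass impulse response
    exc = build_periodic_excitation(first_pulse_pos=17, avg_pitch=60, length=256, lp_h=lp_h)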

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .
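
Claim 14 adds that the shape, sign and amplitude of the first glottal pulse are encoded and transmitted. A hedged Python sketch of one way such an encoder could be organized is given below; the three-sample shape codebook and the 6-bit amplitude quantizer are placeholders and are not taken from the patent.

    import numpy as np

    SHAPE_CODEBOOK = np.array([                 # assumed normalized pulse shapes
        [0.2, 1.0, 0.2],
        [0.5, 1.0, 0.5],
        [-0.3, 1.0, -0.3],
    ])

    def encode_first_glottal_pulse(residual, pulse_pos):
        """Return (shape index, sign, amplitude index); assumes 1 <= pulse_pos <= len(residual) - 2."""
        sign = 1 if residual[pulse_pos] >= 0 else -1
        amplitude = abs(residual[pulse_pos])
        amp_index = int(np.clip(round(amplitude * 8), 0, 63))          # assumed uniform 6-bit quantizer
        segment = sign * residual[pulse_pos - 1: pulse_pos + 2] / (amplitude + 1e-9)
        shape_index = int(np.argmin(np.sum((SHAPE_CODEBOOK - segment) ** 2, axis=1)))
        return shape_index, sign, amp_index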

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .
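
For claim 15, the phase information reduces to locating and quantizing the sample of maximum amplitude within a pitch period. A minimal Python sketch follows; the uniform quantization step and the 64-level budget are assumptions.

    import numpy as np

    def find_first_glottal_pulse(residual, pitch_period):
        """Index of the maximum-amplitude sample within the first pitch period."""
        return int(np.argmax(np.abs(residual[:pitch_period])))

    def quantize_pulse_position(position, pitch_period, num_levels=64):
        """Uniformly quantize the pulse position inside the pitch period (assumed quantizer)."""
        step = max(1, int(np.ceil(pitch_period / num_levels)))
        index = position // step
        return index, index * step                  # (transmitted index, decoded position)

    rng = np.random.default_rng(0)
    residual = rng.standard_normal(256)
    pos = find_first_glottal_pulse(residual, pitch_period=80)
    idx, decoded_pos = quantize_pulse_position(pos, pitch_period=80)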

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .
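
Claim 16 distinguishes how the energy information parameter is computed per frame class. The short Python sketch below follows that split, using the largest squared sample as a stand-in for "a maximum of a signal energy" and the mean squared sample as the average energy per sample; windowing and quantization of the parameter are omitted, and these simplifications are the analyst's, not the patent's.

    import numpy as np

    def energy_information(frame, frame_class):
        x = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset"):
            return float(np.max(x ** 2))    # maximum of the signal energy (simplified)
        return float(np.mean(x ** 2))       # average energy per sample for other classes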

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .
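
Claim 17 describes a two-step energy control in the first good frame after an erasure: first match the energy at the end of the concealed segment, then converge toward the transmitted energy while limiting any increase. The Python sketch below interpolates a per-sample gain between those two targets; the quarter-frame measurement windows and the gain cap are assumptions.

    import numpy as np

    def scale_first_good_frame(synth, e_end_concealed, e_received, max_gain_increase=2.0):
        x = np.asarray(synth, dtype=float)
        n = len(x)
        e_begin = np.mean(x[: n // 4] ** 2) + 1e-12    # energy at the frame beginning (assumed window)
        e_end = np.mean(x[-(n // 4):] ** 2) + 1e-12    # energy at the frame end (assumed window)
        g0 = np.sqrt(e_end_concealed / e_begin)        # match the end of the concealed segment
        g1 = np.sqrt(e_received / e_end)               # converge to the received energy parameter
        g1 = min(g1, g0 * max_gain_increase)           # limit the increase in energy
        gains = np.linspace(g0, g1, n)                 # sample-by-sample interpolation
        return x * gains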

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (frame erasure) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non (predetermined number) erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number (last non) of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .
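
Claims 18 and 19 add class-dependent rules for the scaling gains used in that recovery. The following plain-Python sketch captures the two recited cases (a gain cap when the first good frame is an onset; equal beginning and end gains across a voiced-to-unvoiced or comfort-noise-to-active-speech transition); the cap value is illustrative only.

    def select_scaling_gains(g0, g1, first_good_class, last_good_class,
                             last_good_was_comfort_noise, first_good_is_active,
                             onset_gain_cap=1.0):
        if first_good_class == "onset":
            g0 = min(g0, onset_gain_cap)            # claim 18: limit the scaling gain to a given value
            g1 = min(g1, onset_gain_cap)
        voiced_like = last_good_class in ("voiced transition", "voiced", "onset")
        if (voiced_like and first_good_class == "unvoiced") or \
           (last_good_was_comfort_noise and first_good_is_active):
            g0 = g1                                 # claim 19: same gain at the beginning and the end
        return g0, g1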

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number (last non) of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .
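
The relation of claim 21 (also recited in claims 9, 12 and 25), E_q = E_1·√(E_LP0/E_LP1), can be evaluated directly from the LP filter coefficients on both sides of the erasure. The Python/SciPy sketch below does so; the 64-sample impulse-response length and the example coefficient sets are assumptions.

    import numpy as np
    from scipy.signal import lfilter

    def lp_impulse_response_energy(a_coeffs, n=64):
        """Energy of the impulse response of the all-pole synthesis filter 1/A(z); a_coeffs[0] == 1."""
        impulse = np.zeros(n)
        impulse[0] = 1.0
        h = lfilter([1.0], a_coeffs, impulse)
        return float(np.sum(h ** 2))

    def adjusted_excitation_energy(e1, a_last_good, a_first_good):
        e_lp0 = lp_impulse_response_energy(a_last_good)    # LP filter of the last good frame before erasure
        e_lp1 = lp_impulse_response_energy(a_first_good)   # LP filter of the first good frame after erasure
        return e1 * np.sqrt(e_lp0 / e_lp1)

    e_q = adjusted_excitation_energy(e1=1.0, a_last_good=[1.0, -0.9], a_first_good=[1.0, -0.5])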

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter (coded representation) and a phase information parameter (coded representation) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non (predetermined number) erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5699478A
CLAIM 1
. In a speech coding system for coding speech into a plurality of sequential frames and having a memory table associating each of a plurality of coded speech representations with a corresponding parameter set consisting of a plurality of speech parameters , an error compensation method comprising the following steps : (a) incorporating into each sequential frame a delta parameter specifying the amount by which one of the plurality of speech parameters changes from a given sequential frame to a frame preceding the given sequential frame by a predetermined number (last non) of frames ;
and (b) upon the occurrence of a frame erasure (frame erasure) , updating the memory table based upon the delta parameter of the frame succeeding the erased frame by the predetermined number of frames .

US5699478A
CLAIM 3
. A speech coding method including the following steps : (a) representing speech using a plurality of sequential frames including a present frame and a previous frame , each frame having a predetermined number of bits for representing each of a plurality of speech parameters ;
the plurality of speech parameters comprising a speech parameter set ;
(b) including a delta parameter in the present frame indicative of the change in one of the plurality of speech parameters from the present frame to the previous frame ;
(c) storing a code table in memory associating each of a plurality of speech parameter sets with corresponding digitally coded representations (energy information parameter, phase information parameter) of speech ;
the code table being updated subsequent to the receipt of each new parameter set ;
(d) using the delta parameter to update the code table subsequent to the occurrence of a frame erasure .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5884010A

Filed: 1995-02-16     Issued: 1999-03-16

Linear prediction coefficient generation during frame erasure or packet loss

(Original Assignee) Nokia of America Corp     (Current Assignee) Evonik Goldschmidt GmbH ; Nokia of America Corp

Juin-Hwey Chen, Craig Robert Watkins
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5884010A
CLAIM 1
. A method of synthesizing a signal reflecting human speech , the method for use by a decoder which experiences an erasure of input bits , the decoder including a first excitation signal generator responsive to said input bits and a synthesis filter responsive to an excitation signal , the method comprising the steps of : storing , in a memory , samples of a first excitation signal generated by said first excitation signal generator ;
responsive to a signal indicating the erasure of input bits , synthesizing a second excitation signal based on previously stored samples of the first excitation signal ;
and filtering said second excitation signal to synthesize said signal reflecting human speech ;
wherein the step of synthesizing a second excitation signal includes the steps of : correlating a first subset of samples stored in said memory with a second subset of samples stored in said memory , at least one of said samples in said second subset being earlier in said memory than any sample in said first subset ;
identifying a set of stored excitation signal samples (impulse responses) based on said correlation of said first and second subsets ;
forming said second excitation signal based on said identified set of excitation signal samples .
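
US5884010A claims 1 and 6 describe forming a replacement excitation by correlating recently stored excitation samples with earlier stored samples and reusing the segment identified by the best time lag. A rough Python sketch of that idea follows; the window length, lag range and the unnormalized correlation are simplifications chosen for illustration.

    import numpy as np

    def synthesize_erased_excitation(history, out_len, win=40, min_lag=20, max_lag=120):
        """history must contain at least win + max_lag past excitation samples."""
        h = np.asarray(history, dtype=float)
        recent = h[-win:]                                  # first subset: most recent stored samples
        best_lag, best_corr = min_lag, -np.inf
        for lag in range(min_lag, max_lag + 1):
            candidate = h[-win - lag:-lag]                 # second subset, earlier in the memory
            corr = float(np.dot(recent, candidate))        # unnormalized correlation (simplified)
            if corr > best_corr:
                best_corr, best_lag = corr, lag
        segment = h[-best_lag:]                            # samples identified by the best lag
        reps = int(np.ceil(out_len / best_lag))
        return np.tile(segment, reps)[:out_len], best_lag

    rng = np.random.default_rng(0)
    past_excitation = rng.standard_normal(200)
    replacement, lag = synthesize_erased_excitation(past_excitation, out_len=160)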

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5884010A
CLAIM 6
. The method of claim 1 wherein : the step of correlating comprises determining a time lag value between first and second subsets of samples corresponding to a maximum correlation ;
and the step of identifying a set of stored excitation signal samples comprises identifying (LP filter) said samples based on said time lag value .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (pitch p) = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884010A
CLAIM 6
. The method of claim 1 wherein : the step of correlating comprises determining a time lag value between first and second subsets of samples corresponding to a maximum correlation ;
and the step of identifying a set of stored excitation signal samples comprises identifying (LP filter) said samples based on said time lag value .

US5884010A
CLAIM 8
. The method of claim 7 wherein said test comprises comparing a weight of a single tap pitch predictor (E q) to a threshold .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (time lag) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884010A
CLAIM 6
. The method of claim 1 wherein : the step of correlating comprises determining a time lag (replacement frame) value between first and second subsets of samples corresponding to a maximum correlation ;
and the step of identifying a set of stored excitation signal samples comprises identifying (LP filter) said samples based on said time lag value .

US5884010A
CLAIM 8
. The method of claim 7 wherein said test comprises comparing a weight of a single tap pitch predictor (E q) to a threshold .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5884010A
CLAIM 1
. A method of synthesizing a signal reflecting human speech , the method for use by a decoder which experiences an erasure of input bits , the decoder including a first excitation signal generator responsive to said input bits and a synthesis filter responsive to an excitation signal , the method comprising the steps of : storing , in a memory , samples of a first excitation signal generated by said first excitation signal generator ;
responsive to a signal indicating the erasure of input bits , synthesizing a second excitation signal based on previously stored samples of the first excitation signal ;
and filtering said second excitation signal to synthesize said signal reflecting human speech ;
wherein the step of synthesizing a second excitation signal includes the steps of : correlating a first subset of samples stored in said memory with a second subset of samples stored in said memory , at least one of said samples in said second subset being earlier in said memory than any sample in said first subset ;
identifying a set of stored excitation signal samples (impulse responses) based on said correlation of said first and second subsets ;
forming said second excitation signal based on said identified set of excitation signal samples .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5884010A
CLAIM 6
. The method of claim 1 wherein : the step of correlating comprises determining a time lag value between first and second subsets of samples corresponding to a maximum correlation ;
and the step of identifying a set of stored excitation signal samples comprises identifying (LP filter) said samples based on said time lag value .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (pitch p) = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5884010A
CLAIM 6
. The method of claim 1 wherein : the step of correlating comprises determining a time lag value between first and second subsets of samples corresponding to a maximum correlation ;
and the step of identifying a set of stored excitation signal samples comprises identifying (LP filter) said samples based on said time lag value .

US5884010A
CLAIM 8
. The method of claim 7 wherein said test comprises comparing a weight of a single tap pitch predictor (E q) to a threshold .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (time lag) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (pitch p) = E_1·√(E_LP0/E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5884010A
CLAIM 6
. The method of claim 1 wherein : the step of correlating comprises determining a time lag (replacement frame) value between first and second subsets of samples corresponding to a maximum correlation ;
and the step of identifying a set of stored excitation signal samples comprises identifying (LP filter) said samples based on said time lag value .

US5884010A
CLAIM 8
. The method of claim 7 wherein said test comprises comparing a weight of a single tap pitch predictor (E q) to a threshold .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
EP0691751A1

Filed: 1994-11-29     Issued: 1996-01-10

Method and device for compressing information, method and device for expanding compressed information, device for recording/transmitting compressed information, device for receiving compressed information, and recording medium

(Original Assignee) Sony Corp     (Current Assignee) Sony Corp

Makoto Mitsuno
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (time length) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
EP0691751A1
CLAIM 24
The method as claimed in any one of claims 1 to 25 wherein , when determining the time length (pitch period) of the processing block using changes in the input signal of the processing block under consideration , the boundary value is variable dependent on the amplitude and frequency of the input signal .
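
EP0691751A1 claim 24 ties the processing-block length to changes in the input signal, with a boundary value that depends on the signal's amplitude and frequency. The Python sketch below is a speculative reading of that idea, switching between a long and a short block when the sample-to-sample change exceeds such a boundary; every constant and the zero-crossing frequency proxy are the analyst's assumptions.

    import numpy as np

    def choose_block_length(block, long_len=1024, short_len=256, k_amp=0.5, k_freq=0.25):
        x = np.asarray(block, dtype=float)
        amplitude = np.max(np.abs(x)) + 1e-12
        zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2.0        # crude frequency proxy (zero-crossing rate)
        boundary = amplitude * (k_amp + k_freq * zcr)           # boundary value dependent on amplitude and frequency
        change = np.max(np.abs(np.diff(x)))                     # change in the input signal within the block
        return short_len if change > boundary else long_len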

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (signal component, based signal) within a pitch period (time length) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0691751A1
CLAIM 23
The method as claimed in claim 22 wherein allocation of the main information and/or subsidiary information is inhibited for signal components (maximum amplitude) of a band approximately higher than the signal pass band .

EP0691751A1
CLAIM 24
The method as claimed in any one of claims 1 to 25 wherein , when determining the time length (pitch period) of the processing block using changes in the input signal of the processing block under consideration , the boundary value is variable dependent on the amplitude and frequency of the input signal .

EP0691751A1
CLAIM 32
An apparatus for information compaction comprising block dividing means for dividing input signals with at least two channels into processing blocks , with the processing block length being varied depending on the input signals of the respective channels and with the lengths of the concurrent processing blocks of the respective channels being the same , and means for compacting the information for processing block based signals (maximum amplitude) .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy (rising time) of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
EP0691751A1
CLAIM 41
The apparatus as claimed in any one of claims 32 to 40 comprising time-domain (controlling energy) signal band dividing means for dividing the time-domain signal into plural bands , and orthogonal transform means for transforming the time-domain signals of the respective bands into plural bands on the frequency domain , wherein the block dividing means form processing blocks each consisting of plural samples for bands divided by the band dividing means , said orthogonal transform means executing orthogonal transform for each processing block for producing coefficient data , said information compacting means compacting the coefficient data for the processing blocks .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (signal component, based signal) within a pitch period (time length) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
EP0691751A1
CLAIM 23
The method as claimed in claim 22 wherein allocation of the main information and/or subsidiary information is inhibited for signal components (maximum amplitude) of a band approximately higher than the signal pass band .

EP0691751A1
CLAIM 24
The method as claimed in any one of claims 1 to 25 wherein , when determining the time length (pitch period) of the processing block using changes in the input signal of the processing block under consideration , the boundary value is variable dependent on the amplitude and frequency of the input signal .

EP0691751A1
CLAIM 32
An apparatus for information compaction comprising    block dividing means for dividing input signals with at least two channels into processing blocks , with the processing block length being varied depending on the input signals of the respective channels and with the lengths of the concurrent processing blocks of the respective channels being the same , and    means for compacting the information for processing block based signal (maximum amplitude) s .
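The phase-information limitation of claim 11 (measuring the sample of maximum amplitude within a pitch period as the first glottal pulse and quantizing its position) can be pictured with the short sketch below; operating on the LP residual and the two-sample quantization step are assumptions for illustration only.

    import numpy as np

    def first_glottal_pulse_position(residual, pitch_period, step=2):
        # Measure the sample of maximum amplitude within the first pitch period
        # (taken here on the LP residual, an assumption) and quantize its
        # position with an assumed step of two samples.
        segment = np.asarray(residual[:pitch_period])
        pos = int(np.argmax(np.abs(segment)))   # sample of maximum amplitude
        q_index = pos // step                   # quantized position index
        return pos, q_index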

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (time length) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
EP0691751A1
CLAIM 24
The method as claimed in any one of claims 1 to 25 wherein , when determining the time length (pitch period) of the processing block using changes in the input signal of the processing block under consideration , the boundary value is variable dependent on the amplitude and frequency of the input signal .
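The artificial periodic excitation recited in claim 13 (a low-pass filtered train of pulses, the first impulse response centred on the quantized glottal-pulse position and the remaining responses spaced by the average pitch value) is sketched below in Python; the filter design, frame length and pulse values are arbitrary illustrative choices, not the claimed construction.

    import numpy as np

    def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_taps):
        # Place a unit pulse at the quantized first-glottal-pulse position and
        # further pulses one average pitch period apart, then low-pass filter so
        # that one impulse response of the filter is centred on each pulse.
        excitation = np.zeros(frame_len)
        p = float(first_pulse_pos)
        while p < frame_len:
            idx = int(round(p))
            if idx < frame_len:
                excitation[idx] = 1.0
            p += avg_pitch
        half = len(lp_taps) // 2
        full = np.convolve(excitation, lp_taps, mode="full")
        return full[half:half + frame_len]      # keep the frame-aligned part

    # Example: 20 ms frame at 12.8 kHz, first pulse at sample 37, average pitch
    # of 57.5 samples, and a crude 25-tap windowed-sinc low-pass filter.
    taps = np.sinc(np.arange(-12, 13) / 2.0) * np.hamming(25)
    exc = build_periodic_excitation(256, 37, 57.5, taps)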

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (signal component, based signal) within a pitch period (time length) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0691751A1
CLAIM 23
The method as claimed in claim 22 wherein allocation of the main information and/or subsidiary information is inhibited for signal component (maximum amplitude) s of a band approximately higher than the signal pass band .

EP0691751A1
CLAIM 24
The method as claimed in any one of claims 1 to 25 wherein , when determining the time length (pitch period) of the processing block using changes in the input signal of the processing block under consideration , the boundary value is variable dependent on the amplitude and frequency of the input signal .

EP0691751A1
CLAIM 32
An apparatus for information compaction comprising    block dividing means for dividing input signals with at least two channels into processing blocks , with the processing block length being varied depending on the input signals of the respective channels and with the lengths of the concurrent processing blocks of the respective channels being the same , and    means for compacting the information for processing block based signal (maximum amplitude) s .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (signal component, based signal) within a pitch period (time length) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
EP0691751A1
CLAIM 23
The method as claimed in claim 22 wherein allocation of the main information and/or subsidiary information is inhibited for signal component (maximum amplitude) s of a band approximately higher than the signal pass band .

EP0691751A1
CLAIM 24
The method as claimed in any one of claims 1 to 25 wherein , when determining the time length (pitch period) of the processing block using changes in the input signal of the processing block under consideration , the boundary value is variable dependent on the amplitude and frequency of the input signal .

EP0691751A1
CLAIM 32
An apparatus for information compaction comprising    block dividing means for dividing input signals with at least two channels into processing blocks , with the processing block length being varied depending on the input signals of the respective channels and with the lengths of the concurrent processing blocks of the respective channels being the same , and    means for compacting the information for processing block based signal (maximum amplitude) s .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5699477A

Filed: 1994-11-09     Issued: 1997-12-16

Mixed excitation linear prediction with fractional pitch

(Original Assignee) Texas Instruments Inc     (Current Assignee) Texas Instruments Inc

Alan V. McCree
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (input sound) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .
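US5699477A claim 1 relies on pitch periods that may be nonintegral multiples of the sampling period. One common way to obtain such a fractional pitch value is parabolic interpolation around the autocorrelation peak, sketched below purely for orientation; it is not asserted to be the exact MELP procedure.

    import numpy as np

    def fractional_pitch(frame, lo=20, hi=147):
        # Estimate a pitch period that may be a nonintegral multiple of the
        # sampling period: pick the autocorrelation maximum over integer lags
        # and refine it by parabolic interpolation (illustrative technique).
        frame = np.asarray(frame, dtype=float)
        frame = frame - frame.mean()
        corr = np.array([np.dot(frame[:-lag], frame[lag:]) for lag in range(lo, hi + 1)])
        k = int(np.argmax(corr))
        delta = 0.0
        if 0 < k < len(corr) - 1:
            a, b, c = corr[k - 1], corr[k], corr[k + 1]
            denom = a - 2.0 * b + c
            if denom != 0.0:
                delta = 0.5 * (a - c) / denom   # vertex of the fitted parabola
        return lo + k + delta                   # e.g. 57.3 samples instead of 57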

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .
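Claim 4 computes the energy information parameter from a maximum of the signal energy for frames classified as voiced or onset, and from an average energy per sample for other frames. A minimal sketch of that branch logic is given below; the pitch-length sliding windows used for the maximum, and the string class labels, are assumptions.

    import numpy as np

    def energy_information(frame, frame_class, pitch=None):
        # Maximum-based energy for frames classified as voiced or onset,
        # average energy per sample for all other classes.
        frame = np.asarray(frame, dtype=float)
        if frame_class in ("VOICED", "ONSET"):
            win = int(pitch) if pitch else max(1, len(frame) // 4)
            energies = [np.mean(frame[i:i + win] ** 2)
                        for i in range(0, len(frame) - win + 1, win)]
            return float(max(energies))         # maximum of the signal energy
        return float(np.mean(frame ** 2))       # average energy per sample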

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sounds at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficient (comfort noise) s for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .
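Claim 7 makes the scaling gain used at the beginning of the first good frame equal to the gain used at its end in two transition cases: voiced-to-unvoiced, and comfort noise to active speech. Only that decision logic is sketched below, with hypothetical class labels used for illustration.

    def scaling_gains(g_begin, g_end, last_class_before, first_class_after,
                      last_was_comfort_noise, first_is_active_speech):
        # Force the gain at the beginning of the first good frame to equal the
        # gain at its end for the two transitions recited in claim 7; the class
        # labels are hypothetical strings used only for this sketch.
        voiced_like = ("VOICED TRANSITION", "VOICED", "ONSET")
        voiced_to_unvoiced = (last_class_before in voiced_like
                              and first_class_after == "UNVOICED")
        noise_to_speech = last_was_comfort_noise and first_is_active_speech
        if voiced_to_unvoiced or noise_to_speech:
            g_begin = g_end        # avoid an energy burst at the transition
        return g_begin, g_end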

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .
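Claim 12 recites the relation E_q = E_1 · (E_LP0 / E_LP1), built from the impulse-response energies of the previous and current LP synthesis filters. A self-contained numpy sketch of that computation follows; the 64-sample impulse-response length is an assumption.

    import numpy as np

    def lp_impulse_response(lpc, n=64):
        # Impulse response of the all-pole LP synthesis filter 1/A(z), with
        # lpc = [1, a1, ..., ap]; the 64-sample length is an assumption.
        h = np.zeros(n)
        for i in range(n):
            acc = 1.0 if i == 0 else 0.0
            for k in range(1, len(lpc)):
                if i - k >= 0:
                    acc -= lpc[k] * h[i - k]
            h[i] = acc
        return h

    def adjusted_excitation_energy(e1, lpc_old, lpc_new):
        # E_q = E_1 * (E_LP0 / E_LP1), with E_LP0 and E_LP1 the energies of the
        # impulse responses of the previous and current LP filters.
        e_lp0 = float(np.sum(lp_impulse_response(lpc_old) ** 2))
        e_lp1 = float(np.sum(lp_impulse_response(lpc_new) ** 2))
        return e1 * e_lp0 / e_lp1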

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (input sound) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sounds at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficient (comfort noise) s for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (input sound) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (input sound) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5699477A
CLAIM 1
. A method of encoding sounds , comprising the steps of : (a) providing frames of input sound (pitch period, signal classification parameter) s at a first sampling rate having a first sampling period ;
(b) determining linear prediction coefficients for a frame ;
(c) determining a pitch period for said frame ;
(d) determining correlation strengths for each of N frequency bands of said frame with N an integer greater than 1 ;
and (e) wherein said determining a pitch period of step (c) and determining correlation strengths of step (d) use pitch periods which include nonintegral multiples of said sampling period for a plurality of said frames .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5717818A

Filed: 1994-09-09     Issued: 1998-02-10

Audio signal storing apparatus having a function for converting speech speed

(Original Assignee) Hitachi Ltd     (Current Assignee) Hitachi Ltd

Yoshito Nejime, Yukio Kumagai, Tadashi Takamiya, Yasunori Kawauchi, Nobuo Hataoka, Juichi Morikawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (time length) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe (output means) affected by the artificial construction of the periodic part .
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 26
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a ring buffer for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing audio signal into said ring buffer while the audio signal is read out from said ring buffer and for outputting the audio signal read out from said ring buffer without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said ring buffer in the past and for outputting the past audio signal read out from said ring buffer without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said ring buffer ;
wherein , in said speech speed changing mode , average power of said audio signal is calculated in each input frame unit , said speech speed changing process being executed only in a case where said average power is higher than a predetermined threshold value , and the audio signal of frame unit being directly outputted in a case where said average power is lower than said predetermined threshold value , said speech speed changing mode being executed as a pipeline process by each frame unit with use of a plurality of input frame buffers in such a manner that for data of every frame a pitch extraction process is applied to a leading portion of the frame to detect a pitch of the leading portion , data of a length of one pitch thus detected is transferred to output buffers , data of a length of two pitches is multiplied by a window function which changes from 0 to 1 and by a window function which changes from 1 to 0 , respective data obtained by the multiplications by the window functions are added together to thereby generate a reproduced wave pattern having a time length (pitch period) of two pitches , the reproduced wave pattern being inserted in the rear of the preliminary transferred data of the length of one pitch , a pitch detection process is again carried out while spearheaded by a position preliminary subjected to the pitch extraction process to thereby perform pitch detection at said position , and data of the length of n pitches (n is an integer) based on the pitch length obtained by the final pitch detection to the output buffers ;
a switch for causing said audio signal processing means to execute as said speech speed changing process one of said processing modes of the audio signal processing means ;
and output means for outputting an output of said audio signal processing means as an audio signal .
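US5717818A claim 26 describes a pitch-synchronous speech-speed conversion in which two pitch cycles are multiplied by complementary 0-to-1 and 1-to-0 windows, added together and reinserted into the output. A rough Python sketch in that spirit follows; it only illustrates overlap-add time-scale modification with an assumed integer pitch value, not the claimed pipeline.

    import numpy as np

    def stretch_by_one_pitch(signal, start, pitch):
        # Cross-fade two adjacent pitch cycles with complementary 0->1 and
        # 1->0 ramps and insert the blended cycle, lengthening the signal by
        # one pitch period (rough overlap-add illustration only).
        signal = np.asarray(signal, dtype=float)
        a = signal[start:start + pitch]                  # first pitch cycle
        b = signal[start + pitch:start + 2 * pitch]      # second pitch cycle
        up = np.linspace(0.0, 1.0, pitch)                # window rising 0 -> 1
        down = 1.0 - up                                  # window falling 1 -> 0
        blended = a * down + b * up
        return np.concatenate([signal[:start + pitch], blended, signal[start + pitch:]])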

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5717818A
CLAIM 26
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a ring buffer for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing audio signal into said ring buffer while the audio signal is read out from said ring buffer and for outputting the audio signal read out from said ring buffer without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said ring buffer in the past and for outputting the past audio signal read out from said ring buffer without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said ring buffer ;
wherein , in said speech speed changing mode , average power of said audio signal is calculated in each input frame unit , said speech speed changing process being executed only in a case where said average power is higher than a predetermined threshold value , and the audio signal of frame unit being directly outputted in a case where said average power is lower than said predetermined threshold value , said speech speed changing mode being executed as a pipeline process by each frame unit with use of a plurality of input frame buffers in such a manner that for data of every frame a pitch extraction process is applied to a leading portion of the frame to detect a pitch of the leading portion , data of a length of one pitch thus detected is transferred to output buffers , data of a length of two pitches is multiplied by a window function which changes from 0 to 1 and by a window function which changes from 1 to 0 , respective data obtained by the multiplications by the window functions are added together to thereby generate a reproduced wave pattern having a time length (pitch period) of two pitches , the reproduced wave pattern being inserted in the rear of the preliminary transferred data of the length of one pitch , a pitch detection process is again carried out while spearheaded by a position preliminary subjected to the pitch extraction process to thereby perform pitch detection at said position , and data of the length of n pitches (n is an integer) based on the pitch length obtained by the final pitch detection to the output buffers ;
a switch for causing said audio signal processing means to execute as said speech speed changing process one of said processing modes of the audio signal processing means ;
and output means for outputting an output of said audio signal processing means as an audio signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
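
For context on the energy-information element recited in the claim above, the following minimal Python sketch computes the parameter from the maximum of the signal energy for frames classified as voiced or onset and from the average energy per sample otherwise. The class labels and the dB conversion are assumptions for illustration.

    import numpy as np

    def energy_information(frame, frame_class):
        """Sketch of the claimed calculation: maximum signal energy for
        voiced/onset frames, average energy per sample for other frames."""
        s = np.asarray(frame, dtype=float)
        if frame_class in ("voiced", "onset"):
            energy = np.max(s ** 2)      # maximum of the signal energy
        else:
            energy = np.mean(s ** 2)     # average energy per sample
        return 10.0 * np.log10(energy + 1e-12)  # dB form is an assumption
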
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (output means) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
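
For context on the energy-control element recited in the claim above, the following hedged Python sketch scales the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converges toward the received energy parameter over the frame while capping any gain increase. The quarter-frame energy windows, the linear gain interpolation, and the 1.2 cap are assumptions, not the patent's exact procedure.

    import numpy as np

    def rescale_first_good_frame(synth, e_concealed_end, e_received, max_gain=1.2):
        """Sketch: g0 matches the frame start to the concealed-frame end
        energy, g1 targets the transmitted energy parameter at the frame
        end; both gains are capped to limit an increase in energy."""
        s = np.asarray(synth, dtype=float)
        n = len(s)
        e_begin = np.mean(s[: n // 4] ** 2) + 1e-12   # assumed start window
        e_end = np.mean(s[-(n // 4):] ** 2) + 1e-12   # assumed end window
        g0 = min(np.sqrt(e_concealed_end / e_begin), max_gain)
        g1 = min(np.sqrt(e_received / e_end), max_gain)
        gains = np.linspace(g0, g1, n)                # sample-wise interpolation
        return s * gains
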
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
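
The claim above adds a cap on the scaling gain when the first good frame after the erasure is classified as onset. A small Python sketch under assumed names follows; the cap value of 1.0 is a placeholder, not a value from the patent.

    def limit_onset_gain(gain, frame_class, onset_cap=1.0):
        """Sketch of the claimed limit: clip the scaling gain for onset frames."""
        if frame_class == "onset":
            return min(gain, onset_cap)  # onset_cap is an assumed placeholder
        return gain
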
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
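
For context on the two transitions recited in the claim above, the following hedged Python sketch simply reuses the end-of-frame gain at the beginning of the first good frame for a voiced-to-unvoiced transition and for a comfort-noise-to-active-speech transition; the class and coding-mode labels are assumptions.

    def begin_gain(g_end, last_class, first_class, last_mode, first_mode):
        """Sketch: return the end-of-frame gain as the start-of-frame gain
        for the two claimed transitions; otherwise return None so the normal
        energy-matching rule applies."""
        voiced_like = ("voiced transition", "voiced", "onset")
        voiced_to_unvoiced = last_class in voiced_like and first_class == "unvoiced"
        cn_to_active = last_mode == "comfort_noise" and first_mode == "active_speech"
        if voiced_to_unvoiced or cn_to_active:
            return g_end
        return None  # fall back to the usual start-of-frame gain computation
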
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , adjusting an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 25
. An apparatus according to claim 24 , wherein said third processing is provided as a software executed by said digital signal processor having an input terminal (LP filter excitation signal) for receiving an interruption request signal from the outside , so that an instruction for selection of one mode among said first to third processings by said selecting means is provided to said digital signal processor via the input terminal receiving said interruption request .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (input terminal) produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame (analog signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
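
For context on the relation reproduced in the claim above, the following hedged Python sketch computes the impulse-response energies of the two LP synthesis filters and applies E_q = E_1 · E_LP0 / E_LP1. The 64-sample impulse-response length and the use of scipy.signal.lfilter are assumptions. Because E_LP1 sits in the denominator, a higher-gain LP filter in the first good frame yields a smaller excitation energy, offsetting the filter-gain increase at synthesis.

    import numpy as np
    from scipy.signal import lfilter

    def lp_impulse_response_energy(a_coeffs, length=64):
        """Energy of the impulse response of the all-pole LP synthesis filter
        1/A(z), with a_coeffs = [1, a1, ..., aM]; the length is an assumption."""
        impulse = np.zeros(length)
        impulse[0] = 1.0
        h = lfilter([1.0], a_coeffs, impulse)
        return float(np.sum(h ** 2))

    def adjusted_excitation_energy(e1, a_last_good, a_first_good):
        """Sketch of the claimed relation: E_q = E_1 * E_LP0 / E_LP1, using
        the LP filters of the last good frame before the erasure and the
        first good frame after it."""
        e_lp0 = lp_impulse_response_energy(a_last_good)
        e_lp1 = lp_impulse_response_energy(a_first_good)
        return e1 * e_lp0 / e_lp1
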
US5717818A
CLAIM 25
. An apparatus according to claim 24 , wherein said third processing is provided as a software executed by said digital signal processor having an input terminal (LP filter excitation signal) for receiving an interruption request signal from the outside , so that an instruction for selection of one mode among said first to third processings by said selecting means is provided to said digital signal processor via the input terminal receiving said interruption request .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US5717818A
CLAIM 31
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for digitizing said audio signal and storing the digitized audio signal in accordance with a writing pointer ;
an audio signal processing means for executing one of the following processing modes for the audio signal read out from a reading pointer position of said memory : a through mode for outputting a read-out audio signal from said memory in such a manner that a reading pointer position is same as a writing pointer position , a repeat mode for outputting the audio signal read out from said memory by setting the reading pointer position which is returned back by a predetermined time from the writing pointer , and a speech speed changing mode for outputting the audio signal by executing a speech speed changing process for the audio signal read out from said memory in such a manner that the reading pointer position is gradually delayed from the writing pointer ;
switch for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means for converting the output of said audio signal processing means to an analog signal (current frame) and outputting said analog signal as an audio signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5717818A
CLAIM 26
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a ring buffer for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing audio signal into said ring buffer while the audio signal is read out from said ring buffer and for outputting the audio signal read out from said ring buffer without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said ring buffer in the past and for outputting the past audio signal read out from said ring buffer without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said ring buffer ;
wherein , in said speech speed changing mode , average power of said audio signal is calculated in each input frame unit , said speech speed changing process being executed only in a case where said average power is higher than a predetermined threshold value , and the audio signal of frame unit being directly outputted in a case where said average power is lower than said predetermined threshold value , said speech speed changing mode being executed as a pipeline process by each frame unit with use of a plurality of input frame buffers in such a manner that for data of every frame a pitch extraction process is applied to a leading portion of the frame to detect a pitch of the leading portion , data of a length of one pitch thus detected is transferred to output buffers , data of a length of two pitches is multiplied by a window function which changes from 0 to 1 and by a window function which changes from 1 to 0 , respective data obtained by the multiplications by the window functions are added together to thereby generate a reproduced wave pattern having a time length (pitch period) of two pitches , the reproduced wave pattern being inserted in the rear of the preliminary transferred data of the length of one pitch , a pitch detection process is again carried out while spearheaded by a position preliminary subjected to the pitch extraction process to thereby perform pitch detection at said position , and data of the length of n pitches (n is an integer) based on the pitch length obtained by the final pitch detection is transferred to the output buffers ;
a switch for causing said audio signal processing means to execute as said speech speed changing process one of said processing modes of the audio signal processing means ;
and output means for outputting an output of said audio signal processing means as an audio signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (time lag) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , adjusting an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame (analog signal) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 5
. An apparatus according to claim 1 , further comprising : a catch-up mode for adjusting a quantity of a time lag (replacement frame) from the input audio signal in real time so as to catch up with the real time audio signal when the time lag is caused by the repeat mode or the speech speed changing mode .

US5717818A
CLAIM 25
. An apparatus according to claim 24 , wherein said third processing is provided as a software executed by said digital signal processor having an input terminal (LP filter excitation signal) for receiving an interruption request signal from the outside , so that an instruction for selection of one mode among said first to third processings by said selecting means is provided to said digital signal processor via the input terminal receiving said interruption request .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US5717818A
CLAIM 31
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for digitizing said audio signal and storing the digitized audio signal in accordance with a writing pointer ;
an audio signal processing means for executing one of the following processing modes for the audio signal read out from a reading pointer position of said memory : a through mode for outputting a read-out audio signal from said memory in such a manner that a reading pointer position is same as a writing pointer position , a repeat mode for outputting the audio signal read out from said memory by setting the reading pointer position which is returned back by a predetermined time from the writing pointer , and a speech speed changing mode for outputting the audio signal by executing a speech speed changing process for the audio signal read out from said memory in such a manner that the reading pointer position is gradually delayed from the writing pointer ;
switch for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means for converting the output of said audio signal processing means to an analog signal (current frame) and outputting said analog signal as an audio signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (time length) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe (output means) affected by the artificial construction of the periodic part .
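
For context on the artificial periodic excitation recited in the claim above, the following hedged Python sketch builds a low-pass filtered train of pulses: the first low-pass impulse response is centered on the quantized first-glottal-pulse position, and further impulse responses are placed one average pitch apart up to the end of the affected span. The FIR low-pass designed with scipy.signal.firwin, its order, and its cutoff are assumptions; the patent does not specify this particular filter.

    import numpy as np
    from scipy.signal import firwin

    def artificial_periodic_excitation(frame_len, pulse_pos, avg_pitch,
                                       numtaps=31, cutoff=0.25):
        """Sketch: place low-pass impulse responses at pulse_pos,
        pulse_pos + avg_pitch, ... so the result is a low-pass filtered
        periodic train of pulses separated by the average pitch."""
        if avg_pitch <= 0:
            raise ValueError("avg_pitch must be positive")
        h = firwin(numtaps, cutoff)      # assumed linear-phase FIR low-pass
        excitation = np.zeros(frame_len)
        pos = pulse_pos
        while pos < frame_len:
            start = pos - numtaps // 2   # center the impulse response on pos
            for k, tap in enumerate(h):
                idx = start + k
                if 0 <= idx < frame_len:
                    excitation[idx] += tap
            pos += avg_pitch             # next pulse one average pitch later
        return excitation
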
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 26
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a ring buffer for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing audio signal into said ring buffer while the audio signal is read out from said ring buffer and for outputting the audio signal read out from said ring buffer without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said ring buffer in the past and for outputting the past audio signal read out from said ring buffer without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said ring buffer ;
wherein , in said speech speed changing mode , average power of said audio signal is calculated in each input frame unit , said speech speed changing process being executed only in a case where said average power is higher than a predetermined threshold value , and the audio signal of frame unit being directly outputted in a case where said average power is lower than said predetermined threshold value , said speech speed changing mode being executed as a pipeline process by each frame unit with use of a plurality of input frame buffers in such a manner that for data of every frame a pitch extraction process is applied to a leading portion of the frame to detect a pitch of the leading portion , data of a length of one pitch thus detected is transferred to output buffers , data of a length of two pitches is multiplied by a window function which changes from 0 to 1 and by a window function which changes from 1 to 0 , respective data obtained by the multiplications by the window functions are added together to thereby generate a reproduced wave pattern having a time length (pitch period) of two pitches , the reproduced wave pattern being inserted in the rear of the preliminary transferred data of the length of one pitch , a pitch detection process is again carried out while spearheaded by a position preliminary subjected to the pitch extraction process to thereby perform pitch detection at said position , and data of the length of n pitches (n is an integer) based on the pitch length obtained by the final pitch detection is transferred to the output buffers ;
a switch for causing said audio signal processing means to execute as said speech speed changing process one of said processing modes of the audio signal processing means ;
and output means for outputting an output of said audio signal processing means as an audio signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5717818A
CLAIM 30
. A speech speed conversion apparatus for receiving an input speech and for changing output time without changing a pitch of said input speech , comprising : a portable case having a speech input microphone disposed on a surface of the case , an output control switch , and a speech output terminal , said case being as small as a palm ;
said case includes : compressing means coupled to said input microphone for digitally compressing said input speech in each predetermined length , said compressing of the input speech being deleting of data bit , wherein a speech power is smaller than a first threshold value and the period is longer than a predetermined time length (pitch period) , a memory for storing the compressed speech data in an order of time series , decompressing means for reading out the compressed speech data of the predetermined length from said memory and for releasing said compressed speech data from the compression , speech speed converting means for outputting the decoded speech data in response to an instruction from said output control switch and for converting the speed of said speech data in such a manner that the decompressed speech data , in a period where the power of speech data is larger than a second threshold value , is maintained in constant pitch and the time axis is extended longer than the input time of said period ;
and output means for supplying an output from said speech speed converting means to said speech output terminal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non (time t) erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (output means) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non (time t) erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non (time t) erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 25
. An apparatus according to claim 24 , wherein said third processing is provided as a software executed by said digital signal processor having an input terminal (LP filter excitation signal) for receiving an interruption request signal from the outside , so that an instruction for selection of one mode among said first to third processings by said selecting means is provided to said digital signal processor via the input terminal receiving said interruption request .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (input terminal) produced in the decoder during the received first non (time t) erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame (analog signal) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717818A
CLAIM 25
. An apparatus according to claim 24 , wherein said third processing is provided as a software executed by said digital signal processor having an input terminal (LP filter excitation signal) for receiving an interruption request signal from the outside , so that an instruction for selection of one mode among said first to third processings by said selecting means is provided to said digital signal processor via the input terminal receiving said interruption request .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US5717818A
CLAIM 31
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for digitizing said audio signal and storing the digitized audio signal in accordance with a writing pointer ;
an audio signal processing means for executing one of the following processing modes for the audio signal read out from a reading pointer position of said memory : a through mode for outputting a read-out audio signal from said memory in such a manner that a reading pointer position is same as a writing pointer position , a repeat mode for outputting the audio signal read out from said memory by setting the reading pointer position which is returned back by a predetermined time from the writing pointer , and a speech speed changing mode for outputting the audio signal by executing a speech speed changing process for the audio signal read out from said memory in such a manner that the reading pointer position is gradually delayed from the writing pointer ;
switch for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means for converting the output of said audio signal processing means to an analog signal (current frame) and outputting said analog signal as an audio signal .
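As a side note on this reference, the three read-pointer policies recited in claim 31 of US5717818A (through, repeat, speech speed changing) can be pictured with a short ring-buffer sketch; the class name, buffer handling and the fractional lag used for the speed-changing mode are purely illustrative assumptions.

class PointerBuffer:
    def __init__(self, size):
        self.buf = [0.0] * size
        self.size = size
        self.write_ptr = 0
        self.read_ptr = 0

    def write(self, sample):
        self.buf[self.write_ptr % self.size] = sample
        self.write_ptr += 1

    def read_through(self):
        # through mode: the reading pointer tracks the writing pointer
        self.read_ptr = self.write_ptr - 1
        return self.buf[self.read_ptr % self.size]

    def read_repeat(self, delay):
        # repeat mode: the reading pointer is set back by a fixed delay
        self.read_ptr = self.write_ptr - 1 - delay
        return self.buf[self.read_ptr % self.size]

    def read_slowed(self, lag_per_sample=0.5):
        # speed-changing mode: the reading pointer advances more slowly than
        # the writing pointer, so it is gradually delayed (only the pointer
        # bookkeeping is sketched, not the time-scale modification itself)
        self.read_ptr += 1 - lag_per_sample
        return self.buf[int(self.read_ptr) % self.size]
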

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (time length) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
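A brief sketch of the phase-information step recited above: the sample of maximum amplitude inside one pitch period is taken as the first glottal pulse and its position is quantized. The uniform quantization step and the use of an LP residual as the searched signal are assumptions made only for illustration.

import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    # Search one pitch period of the signal (e.g., an LP residual) for the
    # sample of maximum absolute amplitude, taken as the first glottal pulse.
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))
    # Quantize the position inside the pitch period with a uniform step.
    quantized_index = pos // step
    return pos, quantized_index, quantized_index * step
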
US5717818A
CLAIM 30
. A speech speed conversion apparatus for receiving an input speech and for changing output time without changing a pitch of said input speech , comprising : a portable case having a speech input microphone disposed on a surface of the case , an output control switch , and a speech output terminal , said case being as small as a palm ;
said case includes : compressing means coupled to said input microphone for digitally compressing said input speech in each predetermined length , said compressing of the input speech being deleting of data bit , wherein a speech power is smaller than a first threshold value and the period is longer than a predetermined time length (pitch period) , a memory for storing the compressed speech data in an order of time series , decompressing means for reading out the compressed speech data of the predetermined length from said memory and for releasing said compressed speech data from the compression , speech speed converting means for outputting the decoded speech data in response to an instruction from said output control switch and for converting the speed of said speech data in such a manner that the decompressed speech data , in a period where the power of speech data is larger than a second threshold value , is maintained in constant pitch and the time axis is extended longer than the input time of said period ;
and output means for supplying an output from said speech speed converting means to said speech output terminal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
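A compact sketch of the energy-information computation described above: the maximum of the signal energy for frames classified as voiced or onset, and the average energy per sample otherwise. The class labels and the linear (non-logarithmic) domain are assumptions for illustration.

import numpy as np

def energy_information(frame, frame_class):
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x ** 2))    # maximum of the signal energy
    return float(np.mean(x ** 2))       # average energy per sample for other frames
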
US5717818A
CLAIM 24
. An audio signal storing apparatus having a function for converting speech speed , comprising : a microphone for converting an acoustic signal into an electric signal ;
a first analog amplifier for amplifying an output of said microphone ;
a first low-pass filter for removing high-frequency components from an output of said analog amplifier ;
an A/D converter for converting an output of said analog amplifier into a digital signal ;
a memory means for storing input speech data and data obtained as a result of signal processing ;
a digital signal processor for reading out data from said memory means and for carrying out digital signal processing to execute a speech speed changing process for the acoustic signal in accordance with an external speech speed conversion instruction ;
means for controlling said speech speed changing process executed by said digital signal processor ;
means for changing a parameter of said speech speed changing process ;
selecting means for receiving said speech speed conversion instruction and causing said digital signal processor to execute as said speech speed changing process one of the following processings : a first processing for storing input speech data into said memory means while the speech data are read out from said memory means , and for outputting the speech data read out from said memory means without changing the speech speed , a second processing for reading out the speech data stored in said memory means in the past and for outputting the past speech data read out from said memory means without changing the speech speed , and a third processing for executing said speech speed changing process for the speech data read out from said memory means ;
a D/A converter for converting digital speech data outputted from said digital signal processor into an analog speech signal (speech signal, decoder determines concealment) ;
a second low-pass filter for removing high-frequency components from outputs of said D/A converter ;
a second analog amplifier for amplifying an output of said second low-pass filter ;
and a head-phone for converting an output of said second analog amplifier into an acoustic signal and for supplying the acoustic signal to ears of a user .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (time lag) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non (time t) erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (output means) erased during said frame erasure , adjusts an energy of an LP filter excitation signal (input terminal) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) , where E 1 is an energy at an end of a current frame (analog signal) , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5717818A
CLAIM 1
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for storing said audio signal ;
an audio signal processing means for executing one of the following processing modes : a through mode for storing the audio signal into said memory while the audio signal is read out from said memory without changing the audio signal speed , a repeat mode for reading out the audio signal stored in said memory in the past and for outputting the past audio signal read out from said memory without changing the audio signal speed , and a speech speed changing mode for executing a speech speed changing process for the audio signal read out from said memory ;
switch means for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means (last subframe, last frame) for outputting an output of said audio signal processing means as an audio signal .

US5717818A
CLAIM 5
. An apparatus according to claim 1 , further comprising : a catch-up mode for adjusting a quantity of a time lag (replacement frame) from the input audio signal in real time so as to catch up with the real time audio signal when the time lag is caused by the repeat mode or the speech speed changing mode .

US5717818A
CLAIM 25
. An apparatus according to claim 24 , wherein said third processing is provided as a software executed by said digital signal processor having an input terminal (LP filter excitation signal) for receiving an interruption request signal from the outside , so that an instruction for selection of one mode among said first to third processings by said selecting means is provided to said digital signal processor via the input terminal receiving said interruption request .

US5717818A
CLAIM 29
. An apparatus according to claim 26 , wherein a second threshold value is provided in the comparison process for comparing said average power with the predetermined threshold value so that when a frame having lower average power than said second threshold value is continued for a longer time t (first non) than a predetermined time threshold , data in the frame having lower average power than the second threshold value and continued for a longer time than said time threshold are forbidden to be transferred to the output buffers .

US5717818A
CLAIM 31
. An audio signal storing apparatus having a function for converting speech speed , comprising : audio signal input means for taking in an audio signal ;
a memory for digitizing said audio signal and storing the digitized audio signal in accordance with a writing pointer ;
an audio signal processing means for executing one of the following processing modes for the audio signal read out from a reading pointer position of said memory : a through mode for outputting a read-out audio signal from said memory in such a manner that a reading pointer position is same as a writing pointer position , a repeat mode for outputting the audio signal read out from said memory by setting the reading pointer position which is returned back by a predetermined time from the writing pointer , and a speech speed changing mode for outputting the audio signal by executing a speech speed changing process for the audio signal read out from said memory in such a manner that the reading pointer position is gradually delayed from the writing pointer ;
switch for causing said audio signal processing means to execute as said speech speed changing process one of said three processing modes of said audio signal processing means ;
and output means for converting the output of said audio signal processing means to an analog signal (current frame) and outputting said analog signal as an audio signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5787387A

Filed: 1994-07-11     Issued: 1998-07-28

Harmonic adaptive speech coding method and system

(Original Assignee) Voxware Inc     (Current Assignee) Google LLC

Joseph Gerard Aguilar
US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .
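Read literally, the boundary condition quoted above sets the phase offset of harmonic h in the current voiced segment from the previous segment's quantities, ξ(h) = (h+1)·φ^-(M) + ξ^-(h). A one-function Python sketch (array names are illustrative):

import numpy as np

def continue_harmonic_phases(phi_prev_M, xi_prev):
    # phi_prev_M : the quantity phi^-(M) of the previous segment
    # xi_prev[h] : the phase offset xi^-(h) of harmonic h in the previous segment
    h = np.arange(len(xi_prev))
    return (h + 1) * phi_prev_M + xi_prev   # xi(h) for the current segment
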

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal, initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
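A minimal sketch of the two-part energy control recited above: scale the synthesized signal so that its energy at the beginning of the first good frame matches the energy at the end of the last concealed frame, then converge toward the received energy target at the frame end while limiting the gain increase. The 16-sample measurement windows, the linear gain interpolation and the cap value are assumptions.

import numpy as np

def scale_recovered_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    x = np.asarray(synth, dtype=float)
    e_begin = np.mean(x[:16] ** 2) + 1e-12        # energy at the frame beginning
    e_end = np.mean(x[-16:] ** 2) + 1e-12         # energy toward the frame end
    g_begin = min(np.sqrt(e_end_concealed / e_begin), max_gain)  # match the concealed frame
    g_end = min(np.sqrt(e_target / e_end), max_gain)             # converge, increase limited
    gains = np.linspace(g_begin, g_end, len(x))   # sample-by-sample interpolation
    return x * gains
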
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
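The two special cases recited above both collapse to using the end-of-frame gain from the very start of the first good frame. A short decision sketch follows; the class and coding-mode labels are illustrative strings, not the patent's internal values.

def recovery_start_gain(g_end, last_good_class, first_good_class,
                        last_good_coding, first_good_coding, g_start_default):
    voiced_like = ("voiced transition", "voiced", "onset")
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class == "unvoiced")
    cn_to_active = (last_good_coding == "comfort noise"
                    and first_good_coding == "active speech")
    if voiced_to_unvoiced or cn_to_active:
        return g_end            # start the frame with the end-of-frame gain
    return g_start_default      # otherwise keep the normally computed start gain
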
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (boundary condition) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (synthesized signal, represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5787387A
CLAIM 17
. The method of claim 16 wherein the audio signals being synthesized are speech signals and wherein following the step of detecting the method further comprises the steps of : determining whether a data packet represents a (LP filter excitation signal) voiced or unvoiced speech segment on the basis of the detected fundamental frequency ;
synthesizing unvoiced speech in response to encoded information in a data packet determined to represent unvoiced speech ;
and providing amplitude and phase continuity on the boundary between adjacent synthesized speech segments .
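Claim 17 of US5787387A only requires that the voiced/unvoiced decision be made on the basis of the detected fundamental frequency; one plausible, purely illustrative reading is to test the normalized autocorrelation at the lag implied by that frequency (the threshold and sampling rate below are assumptions).

import numpy as np

def is_voiced(segment, f0_hz, fs_hz=8000, threshold=0.5):
    x = np.asarray(segment, dtype=float)
    if f0_hz <= 0:
        return False
    lag = int(round(fs_hz / f0_hz))
    if lag <= 0 or lag >= len(x):
        return False
    num = np.dot(x[lag:], x[:-lag])
    den = np.sqrt(np.dot(x[lag:], x[lag:]) * np.dot(x[:-lag], x[:-lag])) + 1e-12
    return (num / den) > threshold
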

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US5787387A
CLAIM 24
. The method of claim 22 further comprising the step of generating sound effects by changing the length of the synthesized signal (LP filter excitation signal) segments .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal (synthesized signal, represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 (E LP0 / E LP1) , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5787387A
CLAIM 17
. The method of claim 16 wherein the audio signals being synthesized are speech signals and wherein following the step of detecting the method further comprises the steps of : determining whether a data packet represents a (LP filter excitation signal) voiced or unvoiced speech segment on the basis of the detected fundamental frequency ;
synthesizing unvoiced speech in response to encoded information in a data packet determined to represent unvoiced speech ;
and providing amplitude and phase continuity on the boundary between adjacent synthesized speech segments .

US5787387A
CLAIM 24
. The method of claim 22 further comprising the step of generating sound effects by changing the length of the synthesized signal (LP filter excitation signal) segments .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal (synthesized signal, represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) , where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5787387A
CLAIM 17
. The method of claim 16 wherein the audio signals being synthesized are speech signals and wherein following the step of detecting the method further comprises the steps of : determining whether a data packet represents a (LP filter excitation signal) voiced or unvoiced speech segment on the basis of the detected fundamental frequency ;
synthesizing unvoiced speech in response to encoded information in a data packet determined to represent unvoiced speech ;
and providing amplitude and phase continuity on the boundary between adjacent synthesized speech segments .

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US5787387A
CLAIM 24
. The method of claim 22 further comprising the step of generating sound effects by changing the length of the synthesized signal (LP filter excitation signal) segments .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal (synthesized signal, represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5787387A
CLAIM 17
. The method of claim 16 wherein the audio signals being synthesized are speech signals and wherein following the step of detecting the method further comprises the steps of : determining whether a data packet represents a (LP filter excitation signal) voiced or unvoiced speech segment on the basis of the detected fundamental frequency ;
synthesizing unvoiced speech in response to encoded information in a data packet determined to represent unvoiced speech ;
and providing amplitude and phase continuity on the boundary between adjacent synthesized speech segments .

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US5787387A
CLAIM 24
. The method of claim 22 further comprising the step of generating sound effects by changing the length of the synthesized signal (LP filter excitation signal) segments .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal (synthesized signal, represents a) produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 (E LP0 / E LP1) , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5787387A
CLAIM 17
. The method of claim 16 wherein the audio signals being synthesized are speech signals and wherein following the step of detecting the method further comprises the steps of : determining whether a data packet represents a (LP filter excitation signal) voiced or unvoiced speech segment on the basis of the detected fundamental frequency ;
synthesizing unvoiced speech in response to encoded information in a data packet determined to represent unvoiced speech ;
and providing amplitude and phase continuity on the boundary between adjacent synthesized speech segments .

US5787387A
CLAIM 24
. The method of claim 22 further comprising the step of generating sound effects by changing the length of the synthesized signal (LP filter excitation signal) segments .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal, initial phase) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5787387A
CLAIM 2
. The method of claim 1 wherein the audio signal is a speech signal (speech signal, decoder determines concealment) and following the step of detecting the method further comprises the step of determining whether a segment represents voiced or unvoiced speech on the basis of the detected fundamental frequency .

US5787387A
CLAIM 19
. The method of claim 17 wherein the step of synthesizing a voiced speech comprises the steps of : determining the initial phase (speech signal, decoder determines concealment) offsets for each harmonic frequency ;
and synthesizing voiced speech using the encoded sequence of amplitudes of harmonic frequencies and the determined phase offsets .

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (boundary condition) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal (synthesized signal, represents a) produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 (E LP0 / E LP1) , where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5787387A
CLAIM 17
. The method of claim 16 wherein the audio signals being synthesized are speech signals and wherein following the step of detecting the method further comprises the steps of : determining whether a data packet represents a (LP filter excitation signal) voiced or unvoiced speech segment on the basis of the detected fundamental frequency ;
synthesizing unvoiced speech in response to encoded information in a data packet determined to represent unvoiced speech ;
and providing amplitude and phase continuity on the boundary between adjacent synthesized speech segments .

US5787387A
CLAIM 21
. The method of claim 20 wherein phase continuity for each harmonic frequency in adjacent voiced segments is insured using the boundary condition (phase information parameter) : ξ(h) = (h+1)φ^-(M) + ξ^-(h) , where φ^-(M) and ξ^-(h) are the corresponding quantities of the previous segment .

US5787387A
CLAIM 24
. The method of claim 22 further comprising the step of generating sound effects by changing the length of the synthesized signal (LP filter excitation signal) segments .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5664051A

Filed: 1994-06-23     Issued: 1997-09-02

Method and apparatus for phase synthesis for speech processing

(Original Assignee) Digital Voice Systems Inc     (Current Assignee) Digital Voice Systems Inc

John C. Hardwick, Jae S. Lim
US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .
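
As a rough illustration of how the energy information parameter of claim 4 (and of device claims 16 and 24) could be computed, the sketch below uses a maximum of the squared samples for voiced and onset frames and the average energy per sample otherwise; the frame format, the float64 cast and the reading of "maximum of a signal energy" as the largest squared sample are assumptions, not details taken from the patent.

import numpy as np

def energy_info_parameter(frame: np.ndarray, frame_class: str) -> float:
    """Sketch of the claim 4 energy information parameter.

    Frames classified as voiced or onset use a maximum of the signal
    energy; all other classes use the average energy per sample.
    """
    x = frame.astype(np.float64)
    if frame_class in ("voiced", "onset"):
        # "maximum of a signal energy" read here as the largest squared
        # sample in the frame (an assumption about the exact measure)
        return float(np.max(x ** 2))
    # average energy per sample for unvoiced / transition frames
    return float(np.mean(x ** 2))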

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal from a digitized speech bit stream of the type produced by processing speech with a speech encoder (last frame, replacement frame) , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .
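
A minimal sketch of the energy control recited in claim 5 (and device claim 17), assuming a per-sample linear gain ramp between a start gain that matches the energy of the last concealed frame and an end gain that converges to the received energy information parameter, with both gains capped to limit any increase in energy; the ramp shape, the cap value and the function name are assumptions.

import numpy as np

def scale_first_good_frame(synth: np.ndarray,
                           e_concealed_end: float,
                           e_frame_start: float,
                           e_frame_end: float,
                           e_received: float,
                           max_gain: float = 2.0) -> np.ndarray:
    """Scale the synthesized signal of the first non-erased frame after an
    erasure (sketch of claim 5).

    e_concealed_end : energy at the end of the last concealed (erased) frame
    e_frame_start   : energy of the unscaled synthesis at the frame beginning
    e_frame_end     : energy of the unscaled synthesis at the frame end
    e_received      : energy corresponding to the received energy parameter
    """
    g_start = np.sqrt(e_concealed_end / max(e_frame_start, 1e-12))
    g_end = np.sqrt(e_received / max(e_frame_end, 1e-12))
    # limit any increase in energy (cap value is an assumption)
    g_start, g_end = min(g_start, max_gain), min(g_end, max_gain)
    gains = np.linspace(g_start, g_end, num=len(synth))  # sample-wise ramp
    return synth * gains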

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .
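
The two special cases of claim 7 (and device claim 19) reduce to a simple decision on whether the scaling gain at the start of the first good frame should be forced equal to the gain used at its end; the sketch below only restates those conditions, with illustrative parameter names.

def use_end_gain_at_frame_start(last_good_class: str,
                                first_good_class: str,
                                last_good_is_comfort_noise: bool,
                                first_good_is_active_speech: bool) -> bool:
    """Sketch of the gain-equalization conditions of claim 7."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    # transition from a voiced frame to an unvoiced frame across the erasure
    if last_good_class in voiced_like and first_good_class == "unvoiced":
        return True
    # transition from comfort noise to active speech across the erasure
    if last_good_is_comfort_noise and first_good_is_active_speech:
        return True
    return False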

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal from a digitized speech bit stream of the type produced by processing speech with a speech encoder (last frame, replacement frame) , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal from a digitized speech bit stream of the type produced by processing speech with a speech encoder (last frame, replacement frame) , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .
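
Claims 8, 9 and 12 (and device claims 20 and 25) scale the LP filter excitation when no energy parameter is received, using LP filter gains derived from impulse response energies. A minimal sketch, assuming LP coefficient vectors of the form [1, a1, ..., ap] for the synthesis filter 1/A(z), a 64-sample impulse response and scipy.signal.lfilter; comparing impulse-response energies as a stand-in for comparing filter gains is also an assumption.

import numpy as np
from scipy.signal import lfilter

def lp_impulse_response_energy(a: np.ndarray, length: int = 64) -> float:
    """Energy of the impulse response of the LP synthesis filter 1/A(z)."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)
    return float(np.sum(h ** 2))

def adjust_excitation_energy(excitation: np.ndarray,
                             e1: float,
                             a_before_erasure: np.ndarray,
                             a_first_good: np.ndarray) -> np.ndarray:
    """Sketch of the claim 9 / claim 12 relation E_q = E_1 * E_LP0 / E_LP1.

    The scaling is applied only when the new LP filter has the higher gain,
    with impulse-response energy used as a stand-in for filter gain.
    """
    e_lp0 = lp_impulse_response_energy(a_before_erasure)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    if e_lp1 <= e_lp0:                      # new LP gain not higher
        return excitation
    e_q = e1 * e_lp0 / e_lp1                # target excitation energy
    e_exc = max(float(np.sum(excitation ** 2)), 1e-12)
    return excitation * np.sqrt(e_q / e_exc)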

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (speech encoder) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal from a digitized speech bit stream of the type produced by processing speech with a speech encoder (last frame, replacement frame) , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal from a digitized speech bit stream of the type produced by processing speech with a speech encoder (last frame, replacement frame) , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal (speech signal, decoder determines concealment) from a digitized speech bit stream of the type produced by processing speech with a speech encoder , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (speech encoder) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (speech encoder) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5664051A
CLAIM 1
. A speech decoder apparatus for synthesizing a speech signal from a digitized speech bit stream of the type produced by processing speech with a speech encoder (last frame, replacement frame) , said apparatus comprising an analyzer for processing said digitized speech bit stream to generate an angular frequency and magnitude for each of a plurality of sinusoidal voiced frequency components representing the speech processed by the speech encoder , said analyzer generating said angular frequencies and magnitudes over a sequence of times ;
a random signal generator for generating a time sequence of random phase components ;
a phase synthesizer for generating a time sequence of synthesized phases for at least some of said sinusoidal voiced frequency components , said synthesized phases being generated from said angular frequencies and random phase components ;
a first synthesizer for synthesizing the voiced frequency components of speech from said time sequences of angular frequencies , magnitudes , and synthesized phases ;
and a second synthesizer for synthesizing unvoiced frequency components representing the speech processed by the speech encoder , using a technique different from the technique used for synthesizing the voiced frequency components ;
wherein the speech signal is synthesized by combining synthesized voiced and unvoiced frequency components coexisting at the same time instants .
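
To make the mapped US5664051A claim 1 structure concrete, the sketch below synthesizes voiced components as sinusoids whose phases combine the angular-frequency term with random phase components, generates unvoiced content by a different technique (here plain white noise), and sums the two at the same time instants; the frame length, noise level and noise model are illustrative assumptions, not details of the reference.

import numpy as np

def synthesize_frame(omegas, mags, n=160, unvoiced_level=0.01, seed=0):
    """Sketch of the US5664051A claim 1 structure: voiced sinusoids with
    partly random synthesized phases plus separately generated unvoiced
    (noise-like) content, combined at the same time instants."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    voiced = np.zeros(n)
    for w, m in zip(omegas, mags):
        phase = w * t + rng.uniform(-np.pi, np.pi)  # frequency term + random component
        voiced += m * np.cos(phase)                 # one sinusoidal voiced component
    unvoiced = unvoiced_level * rng.standard_normal(n)  # different technique: noise
    return voiced + unvoiced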




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5598506A

Filed: 1994-06-10     Issued: 1997-01-28

Apparatus and a method for concealing transmission errors in a speech decoder

(Original Assignee) Telefonaktiebolaget LM Ericsson AB     (Current Assignee) Telefonaktiebolaget LM Ericsson AB

Karl T. Wigren, Rolf A. Bergstrom
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .
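
A minimal sketch of the artificial periodic excitation of claim 1 (and device claim 13): the impulse response of a low-pass filter is centered on the quantized position of the first glottal pulse and repeated every average pitch period up to the end of the affected region; the Hann-shaped kernel and the example values are assumptions used only for illustration.

import numpy as np

def build_periodic_excitation(first_pulse_pos: int,
                              avg_pitch: int,
                              length: int,
                              lp_impulse: np.ndarray) -> np.ndarray:
    """Sketch of claim 1 / claim 13: low-pass filtered periodic pulse train."""
    exc = np.zeros(length)
    half = len(lp_impulse) // 2
    pos = first_pulse_pos
    while pos < length:
        # center the low-pass filter impulse response on the pulse position
        for i, h in enumerate(lp_impulse):
            j = pos - half + i
            if 0 <= j < length:
                exc[j] += h
        pos += avg_pitch        # next pulse one average pitch period later
    return exc

# illustrative values; the Hann-shaped kernel stands in for the low-pass filter
excitation = build_periodic_excitation(first_pulse_pos=35, avg_pitch=57,
                                        length=256, lp_impulse=np.hanning(9))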

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .
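
Claim 3 (and device claim 15) determines the phase information parameter by locating the maximum-amplitude sample inside a pitch period and quantizing its position. A minimal sketch, assuming the search is done on an LP residual and a uniform quantization step of 4 samples; both are assumptions.

import numpy as np

def quantized_first_glottal_pulse_position(residual: np.ndarray,
                                           pitch_period: int,
                                           step: int = 4) -> int:
    """Sketch of claim 3: take the maximum-amplitude sample within the first
    pitch period as the first glottal pulse and quantize its position."""
    window = residual[:pitch_period]
    pos = int(np.argmax(np.abs(window)))    # sample of maximum amplitude
    return (pos // step) * step             # uniformly quantized position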

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery (transmission error) comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US5598506A
CLAIM 17
. The method of claim 15 , said filter parameter modifying means further including a stationarity detector connected to an output of said voice activity detector for discriminating between stationary and non-stationary background sounds , wherein said concealing step comprises inhibiting (LP filter) updating of the stationarity/non-stationarity decision obtained from the previous frame if said detected frame is declared non-acceptable in said deciding step .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5598506A
CLAIM 17
. The method of claim 15 , said filter parameter modifying means further including a stationarity detector connected to an output of said voice activity detector for discriminating between stationary and non-stationary background sounds , wherein said concealing step comprises inhibiting (LP filter) updating of the stationarity/non-stationarity decision obtained from the previous frame if said detected frame is declared non-acceptable in said deciding step .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × ( E_LP0 / E_LP1 ) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US5598506A
CLAIM 17
. The method of claim 15 , said filter parameter modifying means further including a stationarity detector connected to an output of said voice activity detector for discriminating between stationary and non-stationary background sounds , wherein said concealing step comprises inhibiting (LP filter) updating of the stationarity/non-stationarity decision obtained from the previous frame if said detected frame is declared non-acceptable in said deciding step .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
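
A non-authoritative sketch of the searcher/quantizer pair recited in claim 15: the sample of maximum amplitude within the pitch period is taken as the first glottal pulse and its position is uniformly quantized. The helper name and the 6-bit resolution are assumptions made here.

```python
import numpy as np

def find_and_quantize_first_glottal_pulse(residual, pitch_period, bits=6):
    """Sketch: the sample of maximum amplitude within the pitch period is
    taken as the first glottal pulse; its position is then uniformly
    quantized (the 6-bit resolution is an assumption, not a claim term)."""
    segment = np.asarray(residual[:pitch_period], dtype=float)
    position = int(np.argmax(np.abs(segment)))        # sample of maximum amplitude
    step = max(pitch_period // (2 ** bits), 1)        # uniform quantization step
    quantized_index = position // step                # value that would be sent
    quantized_position = quantized_index * step       # decoder-side reconstruction
    return quantized_index, quantized_position
```
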
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
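
A minimal sketch of the energy information computation recited in claim 16; the dB conversion at the end is an assumption made here for readability, not language from the claim.

```python
import numpy as np

def energy_information(frame, frame_class):
    """Sketch: energy information parameter per claim 16.

    For frames classified as voiced or onset the parameter is computed in
    relation to the maximum of the signal energy; for all other classes it
    is computed in relation to the average energy per sample."""
    samples = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        energy = np.max(samples ** 2)                # maximum of signal energy
    else:
        energy = np.mean(samples ** 2)               # average energy per sample
    return 10.0 * np.log10(energy + 1e-12)           # dB, guarded against log(0)
```
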
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
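
A minimal sketch, under assumed names (g0, g1, max_gain) and a linear per-sample gain ramp, of the two-stage energy control recited in claim 17: scale the start of the first good frame toward the energy at the end of the last erased frame, then converge toward the energy implied by the received energy information parameter while limiting any increase.

```python
import numpy as np

def rescale_first_good_frame(synth, e_end_concealed, e_target, max_gain=2.0):
    """Sketch: g0 matches the frame start to the per-sample energy at the end
    of the last erased (concealed) frame; g1 converges toward the energy
    corresponding to the received energy information parameter, with any
    increase limited by max_gain.  Gains are interpolated across the frame."""
    synth = np.asarray(synth, dtype=float)
    e_start = np.mean(synth[:max(len(synth) // 4, 1)] ** 2) + 1e-12
    e_frame = np.mean(synth ** 2) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)   # match previous energy
    g1 = min(np.sqrt(e_target / e_frame), max_gain)          # converge to target
    gains = np.linspace(g0, g1, len(synth))                  # sample-by-sample ramp
    return synth * gains
```
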
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery (transmission error) , limits to a given value a gain used for scaling the synthesized sound signal .
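
Claim 18 adds only a clamp on that scaling gain when the first good frame is classified as onset; a trivial sketch follows (the constant is illustrative, the claim says only "a given value").

```python
ONSET_GAIN_LIMIT = 1.2   # illustrative constant; the claim recites "a given value"

def limit_gain_for_onset(gain, frame_class):
    """Sketch: clamp the scaling gain when the first non-erased frame
    received after an erasure is classified as onset."""
    return min(gain, ONSET_GAIN_LIMIT) if frame_class == "onset" else gain
```
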
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery (transmission error) in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US5598506A
CLAIM 17
. The method of claim 15 , said filter parameter modifying means further including a stationarity detector connected to an output of said voice activity detector for discriminating between stationary and non-stationary background sounds , wherein said concealing step comprises inhibiting (LP filter) updating of the stationarity/non-stationarity decision obtained from the previous frame if said detected frame is declared non-acceptable in said deciding step .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (comprises i) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
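
Reading the relation above as E_q = E_1 · (E_LP0 / E_LP1), a worked Python sketch of the decoder-side adjustment follows; the helper lp_impulse_energy and the 64-sample truncation of the impulse response are assumptions, and NumPy/SciPy are used only for convenience.

```python
import numpy as np
from scipy.signal import lfilter

def lp_impulse_energy(lp_coeffs, length=64):
    """Energy of the (truncated) impulse response of an all-pole LP synthesis
    filter 1/A(z); lp_coeffs are the A(z) coefficients with a[0] == 1."""
    impulse = np.zeros(length)
    impulse[0] = 1.0
    h = lfilter([1.0], lp_coeffs, impulse)
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, lp_last_good, lp_first_good):
    """Sketch of E_q = E_1 * (E_LP0 / E_LP1): rescale the excitation energy
    of the first good frame when the gain of its LP filter exceeds that of
    the last erased frame, so that the synthesis energy does not jump."""
    e_lp0 = lp_impulse_energy(lp_last_good)    # last good frame before erasure
    e_lp1 = lp_impulse_energy(lp_first_good)   # first good frame after erasure
    return e1 * e_lp0 / e_lp1
```
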
US5598506A
CLAIM 17
. The method of claim 15 , said filter parameter modifying means further including a stationarity detector connected to an output of said voice activity detector for discriminating between stationary and non-stationary background sounds , wherein said concealing step comprises inhibiting (LP filter) updating of the stationarity/non-stationarity decision obtained from the previous frame if said detected frame is declared non-acceptable in said deciding step .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (transmission error) and decoder recovery (transmission error) in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (comprises i) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5598506A
CLAIM 1
. An apparatus in a receiver in a frame based radio communication system , for concealing transmission error (decoder concealment, decoder recovery, frame concealment, decoder determines concealment) s in a speech decoder caused by a communication channel , which speech decoder is of the source-filter type and is controlled by means including internal state variables updated on a frame by frame basis for modifying received filter parameters representing background sounds transmitted over said communication channel , said apparatus comprising : (a) means for detecting frames containing transmission errors ;
(b) means for deciding whether a frame in which transmission errors have been detected is acceptable ;
(c) means for concealing said detected transmission errors by restricting updating of at least one of said internal state variables of said speech decoder if said detected frame is declared non-acceptable by said deciding means .

US5598506A
CLAIM 17
. The method of claim 15 , said filter parameter modifying means further including a stationarity detector connected to an output of said voice activity detector for discriminating between stationary and non-stationary background sounds , wherein said concealing step comprises inhibiting (LP filter) updating of the stationarity/non-stationarity decision obtained from the previous frame if said detected frame is declared non-acceptable in said deciding step .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5734789A

Filed: 1994-04-18     Issued: 1998-03-31

Voiced, unvoiced or noise modes in a CELP vocoder

(Original Assignee) Hughes Electronics Corp     (Current Assignee) JPMorgan Chase Bank NA ; Hughes Network Systems LLC

Kumar Swaminathan, Kalyan Ganesan, Prabhat K. Gupta
US7693710B2
CLAIM 1
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (second intermediate) per sample for other frames .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US5734789A
CLAIM 3
. The method of claim 2 , wherein a second speech characteristic measured is spectral stationarity , and the method further comprises the steps of : comparing the measured energy with at least two intermediate thresholds representing energy values between the high energy value and the low energy value , the first intermediate threshold representing an energy value higher than the energy value represented by the second intermediate (average energy) threshold ;
setting a third energy flag if the measured energy is below the first intermediate threshold ;
setting a fourth energy flag if the measured energy is below the second intermediate threshold ;
measuring a spectral stationarity for the frame ;
setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity ;
setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity , wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the third energy flag is set ;
or the second spectral stationarity flag is set and the fourth energy flag is set .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (first speech) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (first speech) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (high value) and the first non erased frame received after frame erasure is encoded as active speech .
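
A control-flow sketch of the two transition cases in claim 7 where the gain at the beginning of the first good frame is simply made equal to the gain at its end; the string class labels and the None fallback are illustrative assumptions.

```python
def start_gain(g_end, prev_class, curr_class,
               prev_was_comfort_noise, curr_is_active):
    """Sketch: in a voiced-to-unvoiced transition, or when resuming active
    speech after comfort noise, make the gain at the beginning of the first
    good frame equal to the gain used at its end; otherwise the start gain
    would be derived from the energy-matching rule of claim 5."""
    voiced_to_unvoiced = (prev_class in ("voiced transition", "voiced", "onset")
                          and curr_class == "unvoiced")
    noise_to_speech = prev_was_comfort_noise and curr_is_active
    if voiced_to_unvoiced or noise_to_speech:
        return g_end
    return None   # caller falls back to the energy-matching rule
```
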
US5734789A
CLAIM 1
. A method of processing a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : measuring a value for at least one speech characteristic of a frame , wherein the speech characteristic is selected from the group consisting of spectral stationarity , pitch stationarity , high-frequency content , and energy ;
comparing the measured value of the selected speech characteristic with at least two thresholds , including a high threshold representing a high value (comfort noise) of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic ;
and setting a first flag if the measured value exceeds the high threshold ;
and setting a second flag if the measured energy value is below the low threshold ;
determining whether the frame lacks a substantial speech component based on the determined flags ;
classifying the frame in a noise mode if the frame lacks a substantial speech component , and in a speech mode otherwise ;
and generating an encoded frame in accordance with a noise mode coding scheme if the frame is classified in the noise mode , and in accordance with a speech coding scheme if the frame is classified in the speech mode .

US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .
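
To make the flag logic of US5734789A claims 1 and 2 above easier to follow, a hedged sketch with placeholder thresholds (the reference gives no numeric values here); claim 3's intermediate thresholds and spectral-stationarity flags are only noted in a comment.

```python
def classify_frame_mode(energy, high_thr=1e6, low_thr=1e3):
    """Sketch of the energy-flag test in US5734789A claims 1 and 2: a first
    flag is set when the measured value exceeds the high threshold, a second
    flag when it falls below the low threshold, and the noise/speech mode
    decision follows from the flags.  Thresholds here are placeholders."""
    first_energy_flag = energy > high_thr       # measured value exceeds high threshold
    second_energy_flag = energy < low_thr       # measured value below low threshold
    if second_energy_flag:
        return "noise mode"        # frame lacks a substantial speech component
    if first_energy_flag:
        return "speech mode"       # frame contains a substantial speech component
    # Intermediate energies are handled by claim 3's additional intermediate
    # thresholds and spectral-stationarity flags; default to speech mode here.
    return "speech mode"
```
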

US7693710B2
CLAIM 8
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (characteristic value) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (first speech) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US5734789A
CLAIM 17
. An encoder for encoding a signal having a speech component , the signal being organized as a plurality of frames , comprising : means for measuring a value for at least one speech characteristic of a frame from among the plurality of frames , wherein the speech characteristic is selected from the group consisting of spectral stationarity , pitch stationarity , high-frequency content , and energy ;
a speech characteristic value (current frame) measurer for comparing the measured value of the selected speech characteristic with at least two thresholds , including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic , setting a first flag if the measured value exceeds the high threshold , and setting a second flag if the measured value falls below the low threshold ;
means for determining whether the frame lacks a substantial speech component based on an evaluation of the determined flags ;
a mode classifier for classifying the frame in a noise mode if the frame lacks a substantial speech component , and in a speech mode otherwise ;
and a frame encoder for generating an encoded frame in accordance with a noise mode coding scheme when the frame is classified in the noise mode , and in accordance with a speech coding scheme when the frame is classified in the speech mode .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (first speech) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (characteristic value) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US5734789A
CLAIM 17
. An encoder for encoding a signal having a speech component , the signal being organized as a plurality of frames , comprising : means for measuring a value for at least one speech characteristic of a frame from among the plurality of frames , wherein the speech characteristic is selected from the group consisting of spectral stationarity , pitch stationarity , high-frequency content , and energy ;
a speech characteristic value (current frame) measurer for comparing the measured value of the selected speech characteristic with at least two thresholds , including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic , setting a first flag if the measured value exceeds the high threshold , and setting a second flag if the measured value falls below the low threshold ;
means for determining whether the frame lacks a substantial speech component based on an evaluation of the determined flags ;
a mode classifier for classifying the frame in a noise mode if the frame lacks a substantial speech component , and in a speech mode otherwise ;
and a frame encoder for generating an encoded frame in accordance with a noise mode coding scheme when the frame is classified in the noise mode , and in accordance with a speech coding scheme when the frame is classified in the speech mode .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ; wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (second intermediate) per sample for other frames .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US5734789A
CLAIM 3
. The method of claim 2 , wherein a second speech characteristic measured is spectral stationarity , and the method further comprises the steps of : comparing the measured energy with at least two intermediate thresholds representing energy values between the high energy value and the low energy value , the first intermediate threshold representing an energy value higher than the energy value represented by the second intermediate (average energy) threshold ;
setting a third energy flag if the measured energy is below the first intermediate threshold ;
setting a fourth energy flag if the measured energy is below the second intermediate threshold ;
measuring a spectral stationarity for the frame ;
setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity ;
setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity , wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the third energy flag is set ;
or the second spectral stationarity flag is set and the fourth energy flag is set .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (first speech) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (first speech) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (high value) and the first non erased frame received after frame erasure is encoded as active speech .
US5734789A
CLAIM 1
. A method of processing a signal having a speech component , the signal being organized as a plurality of frames , the method comprising the steps , performed for each frame , of : measuring a value for at least one speech characteristic of a frame , wherein the speech characteristic is selected from the group consisting of spectral stationarity , pitch stationarity , high-frequency content , and energy ;
comparing the measured value of the selected speech characteristic with at least two thresholds , including a high threshold representing a high value (comfort noise) of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic ;
and setting a first flag if the measured value exceeds the high threshold ;
and setting a second flag if the measured energy value is below the low threshold ;
determining whether the frame lacks a substantial speech component based on the determined flags ;
classifying the frame in a noise mode if the frame lacks a substantial speech component , and in a speech mode otherwise ;
and generating an encoded frame in accordance with a noise mode coding scheme if the frame is classified in the noise mode , and in accordance with a speech coding scheme if the frame is classified in the speech mode .

US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (characteristic value) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (first speech) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .
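
To make the energy-adjustment relation recited in claim 21 of US7693710B2 concrete, the following is a minimal numerical sketch in Python. It assumes the LP filters are available as all-pole coefficient vectors and that the impulse-response energies E_LP0 and E_LP1 are estimated over a fixed 64-sample window; the function names and the window length are illustrative assumptions, not taken from the patent.

import numpy as np

def lp_impulse_response_energy(a, num_samples=64):
    # Energy of the impulse response of the all-pole synthesis filter 1/A(z),
    # with A(z) = 1 + a[1]*z^-1 + ... + a[p]*z^-p (a[0] is assumed to be 1).
    p = len(a) - 1
    h = np.zeros(num_samples)
    for n in range(num_samples):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, p + 1):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h * h))

def adjusted_excitation_energy(e1, a_last_good, a_first_good):
    # Claim 21 relation: E_q = E_1 * E_LP0 / E_LP1, where E_LP0 and E_LP1 are
    # the impulse-response energies of the LP filters of the last non erased
    # frame before the erasure and of the first non erased frame after it.
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * e_lp0 / e_lp1

# Example: a higher-gain LP filter after the erasure reduces the excitation energy.
print(adjusted_excitation_energy(2.5, [1.0, -0.5], [1.0, -0.9]))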

US5734789A
CLAIM 17
. An encoder for encoding a signal having a speech component , the signal being organized as a plurality of frames , comprising : means for measuring a value for at least one speech characteristic of a frame from among the plurality of frames , wherein the speech characteristic is selected from the group consisting of spectral stationarity , pitch stationarity , high-frequency content , and energy ;
a speech characteristic value (current frame) measurer for comparing the measured value of the selected speech characteristic with at least two thresholds , including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic , setting a first flag if the measured value exceeds the high threshold , and setting a second flag if the measured value falls below the low threshold ;
means for determining whether the frame lacks a substantial speech component based on an evaluation of the determined flags ;
a mode classifier for classifying the frame in a noise mode if the frame lacks a substantial speech component , and in a speech mode otherwise ;
and a frame encoder for generating an encoded frame in accordance with a noise mode coding scheme when the frame is classified in the noise mode , and in accordance with a speech coding scheme when the frame is classified in the speech mode .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .
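
To make claims 22 and 23 more concrete: they describe a searcher that takes the sample of maximum amplitude within a pitch period as the first glottal pulse and a quantizer of its position. Below is a minimal sketch of that idea, assuming the search runs on an LP residual segment and that the position is quantized on a uniform 6-bit grid; the residual input, the grid and the sign/amplitude handling are illustrative assumptions rather than the patent's actual bit allocation.

import numpy as np

def find_first_glottal_pulse(residual, pitch_period):
    # Take the sample of maximum absolute amplitude within the first pitch
    # period as the first glottal pulse; also record its sign and amplitude.
    segment = np.asarray(residual, dtype=float)[:pitch_period]
    pos = int(np.argmax(np.abs(segment)))
    sign = 1 if segment[pos] >= 0 else -1
    amplitude = float(abs(segment[pos]))
    return pos, sign, amplitude

def quantize_pulse_position(pos, pitch_period, num_bits=6):
    # Uniformly quantize the pulse position within the pitch period on an
    # illustrative 2**num_bits grid; return the codeword and the grid position.
    levels = 1 << num_bits
    step = pitch_period / levels
    index = min(int(pos / step), levels - 1)
    return index, index * step

residual = np.random.default_rng(0).standard_normal(160)
pos, sign, amp = find_first_glottal_pulse(residual, pitch_period=60)
index, reconstructed = quantize_pulse_position(pos, pitch_period=60)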

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (second intermediate) per sample for other frames .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .
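
Claim 24 computes the energy information parameter one way for voiced or onset frames (in relation to a maximum of the signal energy) and another way for the remaining classes (average energy per sample). A minimal sketch of that two-branch computation follows, assuming the maximum is taken as the largest squared sample in the frame and the average as the mean squared sample value; the actual windows and any logarithmic conversion used by the codec are not reproduced here.

import numpy as np

VOICED_LIKE = ("voiced", "onset")

def energy_information(frame, frame_class):
    # Claim 24: relate the parameter to a maximum of the signal energy for
    # voiced/onset frames, and to the average energy per sample otherwise
    # (the exact windows are illustrative assumptions).
    x = np.asarray(frame, dtype=float)
    if frame_class in VOICED_LIKE:
        return float(np.max(x * x))
    return float(np.mean(x * x))

frame = np.random.default_rng(2).standard_normal(256) * 0.1
print(energy_information(frame, "voiced"), energy_information(frame, "unvoiced"))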

US5734789A
CLAIM 3
. The method of claim 2 , wherein a second speech characteristic measured is spectral stationarity , and the method further comprises the steps of : comparing the measured energy with at least two intermediate thresholds representing energy values between the high energy value and the low energy value , the first intermediate threshold representing an energy value higher than the energy value represented by the second intermediate (average energy) threshold ;
setting a third energy flag if the measured energy is below the first intermediate threshold ;
setting a fourth energy flag if the measured energy is below the second intermediate threshold ;
measuring a spectral stationarity for the frame ;
setting a first spectral stationarity flag if the spectral stationarity measurement strongly indicates spectral stationarity ;
setting a second spectral stationarity flag if the spectral stationarity measurement weakly indicates spectral stationarity , wherein the frame is determined to lack a substantial speech component if the first spectral stationarity flag is set and the third energy flag is set ;
or the second spectral stationarity flag is set and the fourth energy flag is set .
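
The flag logic of US5734789A claims 1 through 3 can be summarized as a small decision procedure. The sketch below is one illustrative reading of those claims; the threshold values and the way "strong" and "weak" spectral stationarity are decided are placeholders, since the claims leave them open.

from dataclasses import dataclass

@dataclass
class EnergyFlags:
    high: bool          # measured energy exceeds the high threshold (claim 1)
    low: bool           # measured energy is below the low threshold (claim 1)
    below_mid_hi: bool  # below the first, higher intermediate threshold (claim 3)
    below_mid_lo: bool  # below the second, lower intermediate threshold (claim 3)

def energy_flags(e, t_high, t_mid_hi, t_mid_lo, t_low):
    return EnergyFlags(e > t_high, e < t_low, e < t_mid_hi, e < t_mid_lo)

def lacks_speech(flags, spectral_strong, spectral_weak):
    # Claims 2 and 3: the low/high energy flags decide directly; otherwise the
    # frame lacks speech when strong spectral stationarity coincides with the
    # third energy flag, or weak stationarity with the fourth energy flag.
    if flags.low:
        return True
    if flags.high:
        return False
    return (spectral_strong and flags.below_mid_hi) or (spectral_weak and flags.below_mid_lo)

mode = "noise" if lacks_speech(energy_flags(0.02, 1.0, 0.5, 0.1, 0.01),
                               spectral_strong=True, spectral_weak=True) else "speech"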

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (first speech) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (characteristic value) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5734789A
CLAIM 2
. The method of claim 1 , wherein a first speech (frame erasure, concealing frame erasure) characteristic measured is energy , wherein the first flag is a first energy flag and the second flag is a second energy flag ;
and wherein the frame is determined to lack a substantial speech component if the second energy flag is set , and is determined to contain a substantial speech component if the first energy flag is set .

US5734789A
CLAIM 17
. An encoder for encoding a signal having a speech component , the signal being organized as a plurality of frames , comprising : means for measuring a value for at least one speech characteristic of a frame from among the plurality of frames , wherein the speech characteristic is selected from the group consisting of spectral stationarity , pitch stationarity , high-frequency content , and energy ;
a speech characteristic value (current frame) measurer for comparing the measured value of the selected speech characteristic with at least two thresholds , including a high threshold representing a high value of the selected speech characteristic and a low threshold representing a low value of the selected speech characteristic , setting a first flag if the measured value exceeds the high threshold , and setting a second flag if the measured value falls below the low threshold ;
means for determining whether the frame lacks a substantial speech component based on an evaluation of the determined flags ;
a mode classifier for classifying the frame in a noise mode if the frame lacks a substantial speech component , and in a speech mode otherwise ;
and a frame encoder for generating an encoded frame in accordance with a noise mode coding scheme when the frame is classified in the noise mode , and in accordance with a speech coding scheme when the frame is classified in the speech mode .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5717823A

Filed: 1994-04-14     Issued: 1998-02-10

Speech-rate modification for linear-prediction based analysis-by-synthesis speech coders

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Willem Bastiaan Kleijn
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .
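
Claim 1 (and device claim 13) of US7693710B2 constructs the periodic excitation for a lost onset frame as a low-pass filtered train of pulses: the first impulse response is centered on the quantized position of the first glottal pulse and the remaining ones are placed one average pitch period apart up to the end of the affected subframes. A minimal sketch follows, assuming a generic 17-tap windowed-sinc low-pass filter; the filter and the frame length are illustrative assumptions, not the codec's actual choices.

import numpy as np

def lowpass_fir(cutoff=0.25, taps=17):
    # Windowed-sinc low-pass prototype (normalized cutoff in cycles/sample);
    # purely illustrative, not the filter used by the patent.
    n = np.arange(taps) - (taps - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(taps)
    return h / np.sum(h)

def periodic_excitation(frame_len, first_pulse_pos, avg_pitch, h=None):
    # Centre the first impulse response on the quantized glottal-pulse position,
    # then repeat it every avg_pitch samples up to the end of the frame.
    h = lowpass_fir() if h is None else np.asarray(h, dtype=float)
    exc = np.zeros(frame_len)
    half = len(h) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        for k, hk in enumerate(h):
            idx = pos - half + k
            if 0 <= idx < frame_len:
                exc[idx] += hk
        pos += avg_pitch
    return exc

excitation = periodic_excitation(frame_len=256, first_pulse_pos=23, avg_pitch=57)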

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 21
. A method of providing a telecommunications network messaging service , the service for playing recorded messages to a message recipient at a network terminal , the network including a node having a corresponding memory storing coded speech information , the coded speech information representing a speech message recorded for the message recipient and comprising coded excitation parameters and coded linear prediction parameters , the network node responsive to control signals (maximum amplitude) from the network terminal for playing an audible version of the recorded message , the method comprising the steps of : receiving at the node a control signal from the network terminal , the control signal requesting a modification of speech-rate of the recorded message ;
synthesizing an original speech-rate excitation signal based on one or more of the coded excitation parameters stored in the memory ;
responsive to the control signal , generating a modified speech-rate excitation signal based on the synthesized original speech-rate excitation signal ;
filtering the modified speech-rate excitation signal based on one or more of the coded linear prediction parameters to generate a decoded speech signal having a modified speech-rate as compared to the recorded message ;
and transmitting the decoded speech signal to the network terminal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 10
. The method of claim 1 wherein the coded excitation parameters comprise a fixed codebook (average energy) index and a fixed codebook gain index .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .
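
Claim 5 describes a two-step energy control: scale the synthesized signal so that its energy at the start of the first good frame matches the energy at the end of the last concealed frame, then converge toward the energy given by the received energy information parameter by the end of the frame while limiting any increase. A minimal sketch, assuming a linear sample-by-sample gain interpolation and a simple 2x cap on the allowed energy increase; both choices are illustrative, not the patent's constants.

import numpy as np

def control_energy(frame, e_end_concealed, e_target, max_increase=2.0):
    # Scale the synthesized frame so its start matches e_end_concealed (energy
    # at the end of the last concealed frame) and its end converges to the
    # received target energy, capped to limit any increase (illustrative).
    x = np.asarray(frame, dtype=float)
    n = len(x)
    e_start = np.mean(x[: n // 4] ** 2) + 1e-12
    e_end = np.mean(x[-(n // 4):] ** 2) + 1e-12
    g_start = np.sqrt(e_end_concealed / e_start)
    e_target = min(e_target, max_increase * e_end_concealed)
    g_end = np.sqrt(e_target / e_end)
    gains = np.linspace(g_start, g_end, n)   # sample-by-sample interpolation
    return x * gains

frame = np.random.default_rng(1).standard_normal(256) * 0.1
scaled = control_energy(frame, e_end_concealed=0.005, e_target=0.02)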

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .
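
Claim 7 (and device claim 19) lists two situations in which the gain used at the beginning of the first good frame is simply set equal to the gain used at its end. The condition can be sketched as below, assuming the frame classes and coding modes are available as plain strings, which is purely illustrative.

def use_single_gain(last_good_class, first_good_class, last_good_mode, first_good_mode):
    # True when the gain at the start of the first good frame should equal the
    # gain at its end (claim 7): a voiced-to-unvoiced transition, or a
    # comfort-noise (DTX) to active-speech transition.
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    dtx_to_active = (last_good_mode == "comfort noise"
                     and first_good_mode == "active speech")
    return voiced_to_unvoiced or dtx_to_active

print(use_single_gain("voiced", "unvoiced", "active speech", "active speech"))  # True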

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 21
. A method of providing a telecommunications network messaging service , the service for playing recorded messages to a message recipient at a network terminal , the network including a node having a corresponding memory storing coded speech information , the coded speech information representing a speech message recorded for the message recipient and comprising coded excitation parameters and coded linear prediction parameters , the network node responsive to control signals (maximum amplitude) from the network terminal for playing an audible version of the recorded message , the method comprising the steps of : receiving at the node a control signal from the network terminal , the control signal requesting a modification of speech-rate of the recorded message ;
synthesizing an original speech-rate excitation signal based on one or more of the coded excitation parameters stored in the memory ;
responsive to the control signal , generating a modified speech-rate excitation signal based on the synthesized original speech-rate excitation signal ;
filtering the modified speech-rate excitation signal based on one or more of the coded linear prediction parameters to generate a decoded speech signal having a modified speech-rate as compared to the recorded message ;
and transmitting the decoded speech signal to the network terminal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 21
. A method of providing a telecommunications network messaging service , the service for playing recorded messages to a message recipient at a network terminal , the network including a node having a corresponding memory storing coded speech information , the coded speech information representing a speech message recorded for the message recipient and comprising coded excitation parameters and coded linear prediction parameters , the network node responsive to control signals (maximum amplitude) from the network terminal for playing an audible version of the recorded message , the method comprising the steps of : receiving at the node a control signal from the network terminal , the control signal requesting a modification of speech-rate of the recorded message ;
synthesizing an original speech-rate excitation signal based on one or more of the coded excitation parameters stored in the memory ;
responsive to the control signal , generating a modified speech-rate excitation signal based on the synthesized original speech-rate excitation signal ;
filtering the modified speech-rate excitation signal based on one or more of the coded linear prediction parameters to generate a decoded speech signal having a modified speech-rate as compared to the recorded message ;
and transmitting the decoded speech signal to the network terminal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 10
. The method of claim 1 wherein the coded excitation parameters comprise a fixed codebook (average energy) index and a fixed codebook gain index .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude (control signal) within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 21
. A method of providing a telecommunications network messaging service , the service for playing recorded messages to a message recipient at a network terminal , the network including a node having a corresponding memory storing coded speech information , the coded speech information representing a speech message recorded for the message recipient and comprising coded excitation parameters and coded linear prediction parameters , the network node responsive to control signals (maximum amplitude) from the network terminal for playing an audible version of the recorded message , the method comprising the steps of : receiving at the node a control signal from the network terminal , the control signal requesting a modification of speech-rate of the recorded message ;
synthesizing an original speech-rate excitation signal based on one or more of the coded excitation parameters stored in the memory ;
responsive to the control signal , generating a modified speech-rate excitation signal based on the synthesized original speech-rate excitation signal ;
filtering the modified speech-rate excitation signal based on one or more of the coded linear prediction parameters to generate a decoded speech signal having a modified speech-rate as compared to the recorded message ;
and transmitting the decoded speech signal to the network terminal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .

US5717823A
CLAIM 10
. The method of claim 1 wherein the coded excitation parameters comprise a fixed codebook (average energy) index and a fixed codebook gain index .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5717823A
CLAIM 9
. The method of claim 1 wherein the coded excitation parameters comprise an adaptive codebook (sound signal, speech signal) index and an adaptive codebook gain index .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5450449A

Filed: 1994-03-14     Issued: 1995-09-12

Linear prediction coefficient generation during frame erasure or packet loss

(Original Assignee) AT&T IPM Corp     (Current Assignee) AT&T Corp ; Nokia of America Corp

Peter Kroon
US7693710B2
CLAIM 1
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response (frequency response) of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response (first impulse, impulse responses) of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
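
US5450449A claim 1 recites modifying the stored LP coefficients during an erasure so as to expand the bandwidth of the spectral peaks, without fixing the operation. A common way to do this is to scale the k-th coefficient by gamma**k with gamma slightly below 1, which pulls the poles of 1/A(z) toward the origin and widens the formants; the sketch below uses that standard technique as an assumption, not as the reference's disclosed method.

import numpy as np

def bandwidth_expand(a, gamma=0.98):
    # A(z) -> A(z/gamma): scale a[k] by gamma**k, reducing the pole radii of
    # 1/A(z) and thereby widening its spectral peaks.
    a = np.asarray(a, dtype=float)
    return a * gamma ** np.arange(len(a))

# During an erasure, the stored coefficients of the last good frame would be
# replaced by their bandwidth-expanded version before synthesis (illustrative).
a_stored = np.array([1.0, -1.6, 0.8])
a_concealed = bandwidth_expand(a_stored)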

US7693710B2
CLAIM 2
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
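Note on the phase information parameter recited in claim 3 above: the first glottal pulse is taken as the sample of maximum amplitude within a pitch period, and the position of that sample is quantized. A minimal sketch of that idea follows, assuming an excitation frame and a known pitch period; the uniform quantization step and all names are illustrative assumptions, not values from the '710 patent.

import numpy as np

def first_glottal_pulse(excitation, pitch_period, step=4):
    """Locate and coarsely quantize the first glottal pulse position.

    The sample of maximum absolute amplitude within the first pitch
    period is taken as the first glottal pulse; its position is then
    quantized with a uniform step (step size chosen only for
    illustration).  Sign and amplitude are returned alongside.
    """
    segment = np.asarray(excitation[:pitch_period], dtype=float)
    pos = int(np.argmax(np.abs(segment)))      # sample of maximum amplitude
    sign = 1 if segment[pos] >= 0 else -1      # sign of the pulse
    amplitude = float(abs(segment[pos]))       # amplitude of the pulse
    quantized_pos = (pos // step) * step       # uniform position quantizer
    return quantized_pos, sign, amplitude

exc = np.random.randn(256)
print(first_glottal_pulse(exc, pitch_period=80))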

US7693710B2
CLAIM 4
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
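Note on the energy information parameter recited in claim 4 above: the parameter is computed from the maximum of the signal energy for frames classified as voiced or onset, and from the average energy per sample for other frames. The sketch below illustrates only that distinction; the patent's actual analysis windows and any logarithmic representation are not reproduced, and the names are assumptions.

import numpy as np

def energy_info(frame, frame_class):
    """Energy information parameter (claim 4 idea, simplified).

    Voiced or onset frames: maximum of the signal energy (here the
    maximum squared sample).  All other classes: average energy per
    sample.
    """
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x ** 2))     # maximum of signal energy
    return float(np.mean(x ** 2))        # average energy per sample

print(energy_info(np.random.randn(160), "voiced"))
print(energy_info(np.random.randn(160), "unvoiced"))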

US7693710B2
CLAIM 5
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
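Note on the decoder-side energy control recited in claim 5 above: the synthesized signal of the first good frame after an erasure is scaled so that its starting energy is similar to the energy at the end of the last concealed frame, and the gain is then converged toward the received energy information toward the end of the frame while limiting any increase in energy. The sketch below is a simplified, frame-level rendering of that idea; the gain cap and parameter names are assumptions.

import numpy as np

def control_recovery_energy(synth, e_concealed_end, e_received, max_gain=2.0):
    """Scale the first non-erased frame after an erasure (claim 5 idea).

    g0 matches the frame-start energy to the energy at the end of the
    last concealed frame; g1 moves the frame-end energy toward the
    received energy information parameter, with the upward gain clipped
    (max_gain is an illustrative limit).  The two gains are interpolated
    linearly across the frame for a smooth convergence.
    """
    x = np.asarray(synth, dtype=float)
    e_frame = float(np.mean(x ** 2)) + 1e-12
    g0 = np.sqrt(e_concealed_end / e_frame)
    g1 = min(np.sqrt(e_received / e_frame), max_gain)   # limit energy increase
    gains = np.linspace(g0, g1, x.size)
    return gains * x

frame = np.random.randn(160)
print(control_recovery_energy(frame, e_concealed_end=0.5, e_received=1.0)[:5])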

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (frame erasure) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient (comfort noise) signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
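Note on the special cases recited in claims 6 and 7 above: when the first good frame after the erasure is classified as onset, the scaling gain is limited to a given value; during a voiced-to-unvoiced transition, or when a comfort-noise frame is followed by active speech, the gain used at the beginning of the frame is set equal to the gain used at its end. The sketch below captures only that branching logic; the limit value and all names are assumptions.

def recovery_gains(g_begin, g_end, prev_class, curr_class,
                   prev_comfort_noise=False, curr_active=True, onset_limit=1.0):
    """Branching logic from claims 6 and 7 (illustrative only)."""
    # Claim 6: first good frame after the erasure classified as onset ->
    # clamp the scaling gain to a given value.
    if curr_class == "onset":
        return min(g_begin, onset_limit), g_end
    # Claim 7: voiced-to-unvoiced transition across the erasure, or a
    # comfort-noise frame followed by active speech -> use the frame-end
    # gain from the start of the frame (no artificial energy ramp).
    voiced_to_unvoiced = (prev_class in ("voiced transition", "voiced", "onset")
                          and curr_class == "unvoiced")
    dtx_to_active = prev_comfort_noise and curr_active
    if voiced_to_unvoiced or dtx_to_active:
        return g_end, g_end
    return g_begin, g_end

print(recovery_gains(1.8, 0.6, "voiced", "unvoiced"))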

US7693710B2
CLAIM 8
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
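Note on the relation recited in claims 8 and 9 above: when the LP filter gain increases across the erasure, the excitation energy produced in the decoder is adjusted using E_q = E_1 · (E_LP0 / E_LP1), with E_LP0 and E_LP1 the energies of the impulse responses of the LP filters of the last good frame before, and the first good frame after, the erasure. A minimal sketch follows, assuming the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and an assumed impulse-response length.

import numpy as np

def lp_impulse_response_energy(lpc, n=64):
    """Energy of the first n samples of the impulse response of 1/A(z)."""
    h = np.zeros(n)
    for k in range(n):
        acc = 1.0 if k == 0 else 0.0           # unit impulse input
        for i, a in enumerate(lpc, start=1):
            if k - i >= 0:
                acc -= a * h[k - i]            # IIR recursion of 1/A(z)
        h[k] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1, lpc_prev, lpc_curr):
    """E_q = E_1 * (E_LP0 / E_LP1), the relation recited in claim 9."""
    e_lp0 = lp_impulse_response_energy(lpc_prev)   # last good frame before erasure
    e_lp1 = lp_impulse_response_energy(lpc_curr)   # first good frame after erasure
    return e1 * e_lp0 / e_lp1

print(adjusted_excitation_energy(1.0, [-1.5, 0.7], [-1.2, 0.5]))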

US7693710B2
CLAIM 10
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse (frequency response) response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (frequency response) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response (first impulse, impulse responses) of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
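Note on the artificial onset reconstruction recited in claim 13 above: when an onset frame is lost, the periodic excitation is built as a low-pass filtered train of pulses, with the first low-pass impulse response centered at the quantized glottal-pulse position and the remaining responses spaced by the average pitch up to the end of the last affected subframe. The sketch below is a minimal rendering with a toy symmetric low-pass response; the codec's actual filter and subframe handling are not reproduced.

import numpy as np

def artificial_onset_excitation(frame_len, first_pulse_pos, pitch, lp_filter_ir):
    """Periodic excitation for a lost onset frame (claim 13 idea).

    A train of unit pulses is placed at the quantized position of the
    first glottal pulse and every `pitch` samples thereafter, then
    convolved with the impulse response of a low-pass filter so that a
    centered low-pass impulse response appears at each pulse position.
    """
    exc = np.zeros(frame_len)
    for p in range(first_pulse_pos, frame_len, pitch):
        exc[p] = 1.0
    half = len(lp_filter_ir) // 2
    # Centre each impulse response on its pulse position.
    return np.convolve(exc, lp_filter_ir)[half:half + frame_len]

ir = np.array([0.1, 0.25, 0.3, 0.25, 0.1])   # toy symmetric low-pass response
print(artificial_onset_excitation(64, first_pulse_pos=10, pitch=20, lp_filter_ir=ir)[:16])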

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (frame erasure) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient (comfort noise) signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5450449A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 · (E_LP0 / E_LP1) where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5450449A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , modifying the stored linear prediction coefficient signals to expand the bandwidth of one or more peaks in a frequency response of the linear prediction filter , the modified linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5574825A

Filed: 1994-03-14     Issued: 1996-11-12

Linear prediction coefficient generation during frame erasure or packet loss

(Original Assignee) Nokia of America Corp     (Current Assignee) Nokia of America Corp

Juin-Hwey Chen, Craig R. Watkins
US7693710B2
CLAIM 1
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
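Note on the '825 operation quoted above: this claim pins down the scale factor itself, multiplying each stored coefficient a_i by BEF raised to the power i with 0.95 ≤ BEF ≤ 0.99, which is the same bandwidth-expansion step sketched earlier for the '449 reference. A minimal rendering follows; BEF = 0.97 is merely a value inside the claimed range.

import numpy as np

def scale_lpc_bef(lpc, bef=0.97):
    """Scale stored LP coefficients by BEF**i, 0.95 <= BEF <= 0.99 ('825 claim 1)."""
    assert 0.95 <= bef <= 0.99
    lpc = np.asarray(lpc, dtype=float)
    return lpc * bef ** np.arange(1, lpc.size + 1)

print(scale_lpc_bef([-1.6, 0.9, -0.2]))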

US7693710B2
CLAIM 2
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (frame erasure) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal is a speech signal (speech signal) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient (comfort noise) signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
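The gain rule of US7693710B2 claim 7 charted above reduces to a decision on the frame classes surrounding the erasure. The sketch below illustrates only that decision; the function shape and argument names are assumptions, while the class labels and the use of the end-of-frame gain come from the claim language.

```python
def start_gain_after_erasure(prev_class, first_class, g_begin, g_end,
                             prev_was_comfort_noise=False, first_is_active=False):
    """Return the gain to use at the start of the first good frame (claim 7 rule)."""
    voiced_to_unvoiced = (prev_class in ("voiced transition", "voiced", "onset")
                          and first_class == "unvoiced")
    cn_to_active = prev_was_comfort_noise and first_is_active
    if voiced_to_unvoiced or cn_to_active:
        return g_end          # reuse the end-of-frame gain at the frame beginning
    return g_begin            # otherwise keep the separately computed start gain

# Example: voiced frame lost, recovery frame classified as unvoiced.
print(start_gain_after_erasure("voiced", "unvoiced", g_begin=1.4, g_end=0.8))
```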

US7693710B2
CLAIM 8
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 × (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
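The relation E q = E 1 × (E LP0 / E LP1) in US7693710B2 claim 9 charted above can be checked numerically by computing the energy of each LP synthesis filter's truncated impulse response. In the sketch below, the 64-sample truncation, the sign convention of the all-pole filter and the example first-order coefficients are assumptions used for illustration.

```python
def lp_impulse_response_energy(lp_coeffs, length=64):
    """Energy of the impulse response of an all-pole filter 1 / A(z),
    with A(z) = 1 - sum_k a_k z^-k (sign convention assumed for illustration)."""
    h = []
    for n in range(length):
        x = 1.0 if n == 0 else 0.0
        y = x + sum(a * h[n - k] for k, a in enumerate(lp_coeffs, start=1) if n - k >= 0)
        h.append(y)
    return sum(v * v for v in h)

def adjusted_energy(e1, lp_last_good, lp_first_good):
    """E_q = E_1 * (E_LP0 / E_LP1), the relation recited in claim 9."""
    e_lp0 = lp_impulse_response_energy(lp_last_good)   # last good frame's LP filter
    e_lp1 = lp_impulse_response_energy(lp_first_good)  # first good frame's LP filter
    return e1 * e_lp0 / e_lp1

print(adjusted_energy(10.0, [0.9], [0.5]))   # hypothetical first-order LP filters
```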

US7693710B2
CLAIM 10
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
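US7693710B2 claim 10 charted above recites encoding the shape, sign and amplitude of the first glottal pulse. The sketch below shows one hypothetical way such a quantizer might be organized; the two-entry shape codebook, the correlation-based shape selection and the 4-bit logarithmic amplitude index are all assumptions and are not drawn from the patent.

```python
import math

SHAPE_CODEBOOK = [            # hypothetical, tiny pulse-shape codebook
    [0.3, 1.0, 0.3],          # broad pulse
    [0.0, 1.0, 0.0],          # narrow pulse
]

def encode_first_glottal_pulse(pulse):
    """Return (shape_index, sign, amplitude_index) for a short pulse segment."""
    sign = 1 if max(pulse, key=abs) >= 0 else -1
    ref = [sign * x for x in pulse]

    # Pick the codebook shape with the largest normalized correlation (assumed rule).
    def corr(shape):
        num = sum(a * b for a, b in zip(ref, shape))
        den = math.sqrt(sum(a * a for a in shape)) or 1e-9
        return num / den

    shape_index = max(range(len(SHAPE_CODEBOOK)), key=lambda i: corr(SHAPE_CODEBOOK[i]))
    amplitude = max(abs(x) for x in pulse)
    amplitude_index = min(15, int(round(4 * math.log2(1.0 + amplitude))))  # 4-bit log scale
    return shape_index, sign, amplitude_index

print(encode_first_glottal_pulse([0.2, 0.9, 0.1]))
```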

US7693710B2
CLAIM 11
. A method of concealing frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
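US7693710B2 claim 11 charted above locates the first glottal pulse as the sample of maximum amplitude within a pitch period and quantizes its position. A minimal sketch follows; the uniform 6-bit quantization grid and the hypothetical residual signal are illustrative assumptions.

```python
def quantize_first_glottal_pulse_position(residual, pitch_period, bits=6):
    """Locate the max-amplitude sample in the first pitch period and quantize
    its position on a uniform grid (grid resolution is an assumption)."""
    segment = residual[:pitch_period]
    position = max(range(len(segment)), key=lambda n: abs(segment[n]))
    levels = 2 ** bits
    step = max(1, pitch_period // levels)         # uniform quantization step
    index = position // step
    reconstructed = index * step                  # decoder-side position estimate
    return position, index, reconstructed

residual = [0.1, -0.2, 0.05, 1.3, -0.4, 0.2, 0.0, -0.1] * 10   # hypothetical residual
print(quantize_first_glottal_pulse_position(residual, pitch_period=40))
```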

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 × (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
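US7693710B2 claim 13 charted above constructs the periodic excitation for a lost onset frame as a low-pass filtered train of pulses: the first low-pass impulse response is centered on the quantized glottal pulse position, and subsequent responses are placed an average pitch apart. The sketch below assumes a short symmetric 5-tap impulse response, an integer average pitch and a single 160-sample frame; all of these are illustrative choices.

```python
def build_periodic_excitation(frame_length, first_pulse_pos, avg_pitch, lp_impulse):
    """Place low-pass filter impulse responses at first_pulse_pos, then every
    avg_pitch samples, up to the end of the reconstructed segment."""
    excitation = [0.0] * frame_length
    half = len(lp_impulse) // 2
    pos = first_pulse_pos
    while pos < frame_length:
        # Center the impulse response on the current pulse position.
        for k, h in enumerate(lp_impulse):
            idx = pos - half + k
            if 0 <= idx < frame_length:
                excitation[idx] += h
        pos += avg_pitch                      # next pulse one average pitch later
    return excitation

# Hypothetical 5-tap low-pass impulse response and frame parameters.
lp_h = [0.1, 0.25, 0.3, 0.25, 0.1]
exc = build_periodic_excitation(frame_length=160, first_pulse_pos=17, avg_pitch=57,
                                lp_impulse=lp_h)
print([i for i, v in enumerate(exc) if v == max(exc)])   # pulse centers: 17, 74, 131
```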

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .
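US7693710B2 claim 16 charted above computes the energy information parameter differently by frame class. The sketch below adopts one reading of that language, treating "maximum of a signal energy" as the largest squared sample and "average energy per sample" as the mean squared sample; both interpretations are assumptions.

```python
def energy_information_parameter(frame, frame_class):
    """Energy parameter per the claim language: a maximum-based value for
    voiced/onset frames, an average energy per sample otherwise."""
    if frame_class in ("voiced", "onset"):
        return max(x * x for x in frame)              # assumed maximum-energy reading
    return sum(x * x for x in frame) / len(frame)     # assumed per-sample average

frame = [0.1, -0.4, 0.9, -0.2, 0.05]                  # hypothetical frame samples
print(energy_information_parameter(frame, "onset"))      # 0.81
print(energy_information_parameter(frame, "unvoiced"))   # mean squared sample
```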

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (frame erasure) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (frame erasure) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise (linear prediction coefficient) and the first non erased frame received after frame erasure is encoded as active speech .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient (comfort noise) signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q = E 1 × (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (frame erasure) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5574825A
CLAIM 1
. A method of generating linear prediction filter coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (frame erasure) caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (speech signal) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal (speech signal, decoder determines concealment) , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (frame erasure) caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter (linear prediction filter) , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q = E 1 × (E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5574825A
CLAIM 1
. A method of generating linear prediction filter (signal classification parameter) coefficient signals during frame erasure (frame erasure) , the generated linear prediction coefficient signals for use by a linear prediction filter in synthesizing a speech signal , the method comprising the steps of : storing linear prediction coefficient signals in a memory , said linear prediction coefficient signals generated responsive to a speech signal corresponding to a non-erased frame ;
and responsive to a frame erasure , scaling one or more of said stored linear prediction coefficient signals by a scale factor , BEF raised to an exponent i , where 0 . 95≦BEF≦0 . 99 and where i indexes the stored linear prediction coefficient signals , the scaled linear prediction coefficient signals applied to the linear prediction filter for use in synthesizing the speech signal .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5517595A

Filed: 1994-02-08     Issued: 1996-05-14

Decomposition in noise and periodic signal waveforms in waveform interpolation

(Original Assignee) AT&T Corp     (Current Assignee) AT&T Corp

Willem B. Kleijn
US7693710B2
CLAIM 1
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (said signals) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals (onset frame) evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 12
. The method of claim 1 wherein said parameters comprise time-domain signal samples (impulse responses) .

US7693710B2
CLAIM 2
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (determining parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 3
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (determining parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 4
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (determining parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy (fixed codebook) per sample for other frames .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US5517595A
CLAIM 20
. A method of coding a speech signal using a set of fixed codebooks (average energy) , the speech signal comprising sequential sets of samples of said speech signal , each set of samples specifying the value of said signals at a specific point in time , the method comprising the steps of : coding a first set of samples of the speech signal with a first codebook ;
and coding a different time-successive set of samples of the speech signal with a codebook other than said first codebook .

US7693710B2
CLAIM 5
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (determining parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 8
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter (determining parameters) related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (domain samples, represents a) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an (LP filter, LP filter excitation signal) evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 8
. The method of claim 6 wherein the values of a signal of the first set represent time-domain samples (LP filter, LP filter excitation signal) of characterizing waveforms .

US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter (domain samples, represents a) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q (weighted average) = E 1 × (E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an (LP filter, LP filter excitation signal) evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 6
. The method of claim 5 wherein the step of smoothing comprises forming a weighted average (E q) of values of a signal of said first set .

US5517595A
CLAIM 8
. The method of claim 6 wherein the values of a signal of the first set represent time-domain samples (LP filter, LP filter excitation signal) of characterizing waveforms .

US7693710B2
CLAIM 10
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 11
. A method of concealing frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter (domain samples, represents a) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (weighted average) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of the current frame , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
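For orientation, a minimal numeric sketch of the energy-adjustment relation recited above, reading it as E q = E 1 √(E LP0 / E LP1); the function name and the example energy values are placeholders, not taken from the patent.

```python
import math

def adjust_excitation_energy(E1, E_LP0, E_LP1):
    """E_q = E_1 * sqrt(E_LP0 / E_LP1): target energy of the LP filter
    excitation in the first good frame after an erasure (values illustrative)."""
    return E1 * math.sqrt(E_LP0 / E_LP1)

# Example: the new LP filter has the higher gain (larger impulse-response energy),
# so the excitation energy is scaled down to compensate.
print(adjust_excitation_energy(E1=0.8, E_LP0=1.0, E_LP1=2.5))  # ~0.506
```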
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an (LP filter, LP filter excitation signal) evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 6
. The method of claim 5 wherein the step of smoothing comprises forming a weighted average (E q) of values of a signal of said first set .

US5517595A
CLAIM 8
. The method of claim 6 wherein the values of a signal of the first set represent time-domain samples (LP filter, LP filter excitation signal) of characterizing waveforms .

US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ; wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (said signals) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses (signal samples) of the low-pass filter each with a distance corresponding to an average pitch value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
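A compact sketch of the artificial periodic-excitation construction described in this claim, assuming a generic windowed-sinc low-pass impulse response; the filter length, cutoff and placement loop below are assumptions made for the example.

```python
import numpy as np

def build_periodic_excitation(frame_len, first_pulse_pos, avg_pitch, lp_impulse=None):
    """Center a low-pass filter impulse response on the quantized first-pulse
    position and repeat it every average pitch period up to the frame end."""
    if lp_impulse is None:
        # Placeholder low-pass impulse response (windowed sinc), not from the patent.
        n = np.arange(-8, 9)
        lp_impulse = np.sinc(n / 4.0) * np.hamming(len(n))
    avg_pitch = max(1, int(round(avg_pitch)))
    half = len(lp_impulse) // 2
    excitation = np.zeros(frame_len)
    pos = first_pulse_pos
    while pos < frame_len:
        start, end = pos - half, pos - half + len(lp_impulse)
        lo, hi = max(0, start), min(frame_len, end)
        excitation[lo:hi] += lp_impulse[lo - start: hi - start]
        pos += avg_pitch                     # next pulse one average pitch later
    return excitation
```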
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals (onset frame) evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 12
. The method of claim 1 wherein said parameters comprise time-domain signal samples (impulse responses) .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
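A short sketch of the energy information parameter as characterized in this claim: a maximum-based energy measure for frames classified as voiced or onset, and an average energy per sample otherwise. The windowing and the exact maximum measure are assumptions for the example.

```python
import numpy as np

def energy_information_parameter(frame, frame_class):
    """Maximum signal energy for voiced/onset frames, average energy per sample
    for other frames (details of the maximum measure are assumed here)."""
    x = np.asarray(frame, dtype=float)
    if frame_class in ("voiced", "onset"):
        return float(np.max(x ** 2))         # maximum-based energy measure
    return float(np.mean(x ** 2))            # average energy per sample
```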
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US5517595A
CLAIM 20
. A method of coding a speech signal using a set of fixed codebooks (average energy) , the speech signal comprising sequential sets of samples of said speech signal , each set of samples specifying the value of said signals at a specific point in time , the method comprising the steps of : coding a first set of samples of the speech signal with a first codebook ;
and coding a different time-successive set of samples of the speech signal with a codebook other than said first codebook .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
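A minimal sketch of the energy control described in this claim: scale the first good frame so its initial energy matches the end of the concealed frame, then converge toward the transmitted energy information while capping any energy increase. The linear gain interpolation and the cap value are assumptions, not patent text.

```python
import numpy as np

def scale_first_good_frame(synth, E_end_concealed, E_target, max_gain=2.0):
    """Scale the first good frame after an erasure: start from a gain matching the
    concealed-frame end energy, converge linearly to a gain matching the received
    energy parameter, and cap the gain to limit an increase in energy."""
    x = np.asarray(synth, dtype=float)
    head = max(1, len(x) // 4)
    E_begin = np.mean(x[:head] ** 2) + 1e-12        # energy near the frame start
    E_frame = np.mean(x ** 2) + 1e-12
    g0 = min(max_gain, np.sqrt(E_end_concealed / E_begin))  # match previous energy
    g1 = min(max_gain, np.sqrt(E_target / E_frame))         # match transmitted energy
    gains = np.linspace(g0, g1, len(x))                     # converge across the frame
    return x * gains
```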
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter (domain samples, represents a) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an (LP filter, LP filter excitation signal) evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 8
. The method of claim 6 wherein the values of a signal of the first set represent time-domain samples (LP filter, LP filter excitation signal) of characterizing waveforms .

US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter (domain samples, represents a) excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E q (weighted average) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an (LP filter, LP filter excitation signal) evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 6
. The method of claim 5 wherein the step of smoothing comprises forming a weighted average (E q) of values of a signal of said first set .

US5517595A
CLAIM 8
. The method of claim 6 wherein the values of a signal of the first set represent time-domain samples (LP filter, LP filter excitation signal) of characterizing waveforms .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy (fixed codebook) per sample for other frames .
US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .

US5517595A
CLAIM 20
. A method of coding a speech signal using a set of fixed codebooks (average energy) , the speech signal comprising sequential sets of samples of said speech signal , each set of samples specifying the value of said signals at a specific point in time , the method comprising the steps of : coding a first set of samples of the speech signal with a first codebook ;
and coding a different time-successive set of samples of the speech signal with a codebook other than said first codebook .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure caused by frames erased during transmission of a sound signal encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter (determining parameters) related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter (domain samples, represents a) of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E q (weighted average) = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of a current frame , E LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E LP1 is an energy of an impulse response of the LP filter to the received first non erased frame following frame erasure .
US5517595A
CLAIM 1
. A method of coding a speech signal , the method comprising the steps of : 1 . generating a time-ordered sequence of sets of parameters based on samples of the speech signal , each set of parameters corresponding to a waveform characterizing the speech signal ;
2 . grouping parameters of the plurality of sets based on index values for said parameters to form a first set of signals which set represents an (LP filter, LP filter excitation signal) evolution of characterizing waveform shape across the time-ordered sequence of sets ;
3 . filtering signals of the first set to remove low-frequency components of said signals evolving over time at low frequencies , wherein said filtering produces a second set of signals which second set represents relatively high rates of evolution of characterizing waveform shape ;
and

US5517595A
CLAIM 6
. The method of claim 5 wherein the step of smoothing comprises forming a weighted average (E q) of values of a signal of said first set .

US5517595A
CLAIM 8
. The method of claim 6 wherein the values of a signal of the first set represent time-domain samples (LP filter, LP filter excitation signal) of characterizing waveforms .

US5517595A
CLAIM 9
. The method of claim 1 wherein the step of coding comprises determining parameters (phase information parameter) corresponding to a second characterizing waveform based on the second set of signals and coding said speech signal based on said determined values .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5862518A

Filed: 1993-12-23     Issued: 1999-01-19

Speech decoder for decoding a speech signal using a bad frame masking unit for voiced frame and a bad frame masking unit for unvoiced frame

(Original Assignee) NEC Corp     (Current Assignee) NEC Corp

Toshiyuki Nomura, Kazunori Ozawa
US7693710B2
CLAIM 1
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame (current frames) is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (pitch period, error frame) ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5862518A
CLAIM 1
. A speech decoder , comprising : a receiving unit for receiving and outputting parameters of spectral data , pitch data corresponding to a pitch period (decoder concealment, pitch period, decoder determines concealment) , and index data , and gain data of an excitation signal for each frame having a predetermined interval of a speech signal ;
a speech decoder unit for reproducing a speech signal by using said parameters ;
an error correcting unit for correcting an error in said speech signal ;
an error detecting unit for detecting an error frame (decoder concealment, pitch period, decoder determines concealment) incapable of correction in said speech signal ;
a voiced/unvoiced frame judging unit (average pitch value) for judging whether said error frame detected by said error detecting unit is a voiced frame or an unvoiced frame based upon a plurality of feature quantities of a speech signal reproduced in a past frame ;
a bad frame masking unit for voiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as a voiced frame by using said spectral data , said pitch data and said gain data of the past frame , and said index data of said error frame ;
a bad frame masking unit for unvoiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as an unvoiced frame by using said spectral data and said gain data of the past frame and said index data of said error frame ;
and a switching unit for outputting one of the voiced frame and the unvoiced frame according to the judgment result in said voiced/unvoiced frame judging unit .
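A schematic sketch of the decision structure recited in US5862518A claim 1: when an uncorrectable error frame is detected, a voiced/unvoiced judgment on previously reproduced speech selects one of two bad-frame masking units. The single zero-crossing-rate feature and the threshold used here stand in for the claim's "plurality of feature quantities" and are illustrative only.

```python
import numpy as np

def conceal_error_frame(prev_speech, params_past, params_err, zcr_threshold=0.3):
    """Select voiced or unvoiced bad-frame masking from features of the
    previously reproduced speech (zero-crossing rate used as a stand-in)."""
    x = np.asarray(prev_speech, dtype=float)
    zcr = float(np.mean(np.signbit(x[1:]) != np.signbit(x[:-1])))
    if zcr < zcr_threshold:                      # low ZCR -> treat as voiced
        return mask_voiced(params_past, params_err)
    return mask_unvoiced(params_past, params_err)

def mask_voiced(past, err):
    """Reuse past spectral, pitch and gain data with the error frame's index data."""
    return {"spectrum": past["spectrum"], "pitch": past["pitch"],
            "gain": past["gain"], "index": err["index"]}

def mask_unvoiced(past, err):
    """Reuse past spectral and gain data with the error frame's index data."""
    return {"spectrum": past["spectrum"], "gain": past["gain"],
            "index": err["index"]}
```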

US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (pitch period, error frame) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5862518A
CLAIM 1
. A speech decoder , comprising : a receiving unit for receiving and outputting parameters of spectral data , pitch data corresponding to a pitch period (decoder concealment, pitch period, decoder determines concealment) , and index data , and gain data of an excitation signal for each frame having a predetermined interval of a speech signal ;
a speech decoder unit for reproducing a speech signal by using said parameters ;
an error correcting unit for correcting an error in said speech signal ;
an error detecting unit for detecting an error frame (decoder concealment, pitch period, decoder determines concealment) incapable of correction in said speech signal ;
a voiced/unvoiced frame judging unit for judging whether said error frame detected by said error detecting unit is a voiced frame or an unvoiced frame based upon a plurality of feature quantities of a speech signal reproduced in a past frame ;
a bad frame masking unit for voiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as a voiced frame by using said spectral data , said pitch data and said gain data of the past frame , and said index data of said error frame ;
a bad frame masking unit for unvoiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as an unvoiced frame by using said spectral data and said gain data of the past frame and said index data of said error frame ;
and a switching unit for outputting one of the voiced frame and the unvoiced frame according to the judgment result in said voiced/unvoiced frame judging unit .

US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (current frames) erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (current frames) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (current frames) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
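A small decision sketch of the two transition cases recited in this claim under which the scaling gain at the start of the first good frame is simply set equal to the end-of-frame gain; the class and coding labels are plain strings chosen for readability, not patent terminology beyond the claim itself.

```python
def use_end_gain_at_frame_start(last_good_class, first_good_class,
                                last_good_coding, first_good_coding):
    """True when the gain at the beginning of the first good frame after an
    erasure should equal the gain used at its end, per the two cases in claim 7."""
    voiced_to_unvoiced = (last_good_class in ("voiced transition", "voiced", "onset")
                          and first_good_class == "unvoiced")
    inactive_to_active = (last_good_coding == "comfort noise"
                          and first_good_coding == "active speech")
    return voiced_to_unvoiced or inactive_to_active
```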
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (current frames) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E q = E 1 √(E LP0 / E LP1) where E 1 is an energy at an end of the current frame (current frames) , E LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (current frames) , and E LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period (pitch period, error frame) as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
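
A minimal sketch of the position quantization step recited in claim 11, assuming a uniform grid whose resolution depends on the pitch period; the bit allocation is a hypothetical choice.

```python
def quantize_pulse_position(position: int, pitch_period: int, bits: int = 6) -> int:
    """Quantize the position of the maximum-amplitude sample within the
    pitch period onto a 2**bits grid (grid size is an assumption)."""
    levels = 1 << bits
    step = max(1, pitch_period // levels)
    index = min(position // step, levels - 1)
    return index  # transmitted index; the decoder recovers position ~ index * step
```
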
US5862518A
CLAIM 1
. A speech decoder , comprising : a receiving unit for receiving and outputting parameters of spectral data , pitch data corresponding to a pitch period (decoder concealment, pitch period, decoder determines concealment) , and index data , and gain data of an excitation signal for each frame having a predetermined interval of a speech signal ;
a speech decoder unit for reproducing a speech signal by using said parameters ;
an error correcting unit for correcting an error in said speech signal ;
an error detecting unit for detecting an error frame (decoder concealment, pitch period, decoder determines concealment) incapable of correction in said speech signal ;
a voiced/unvoiced frame judging unit for judging whether said error frame detected by said error detecting unit is a voiced frame or an unvoiced frame based upon a plurality of feature quantities of a speech signal reproduced in a past frame ;
a bad frame masking unit for voiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as a voiced frame by using said spectral data , said pitch data and said gain data of the past frame , and said index data of said error frame ;
a bad frame masking unit for unvoiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as an unvoiced frame by using said spectral data and said gain data of the past frame and said index data of said error frame ;
and a switching unit for outputting one of the voiced frame and the unvoiced frame according to the judgment result in said voiced/unvoiced frame judging unit .

US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (current frames) caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame (current frames) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (current frames) erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of the current frame (current frames) , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
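
The energy adjustment relation E_q = E_1 × (E_LP0 / E_LP1) can be sketched as follows; the impulse-response truncation length and the LP coefficient convention A(z) = 1 + a_1 z^-1 + ... are taken for illustration only.

```python
import numpy as np

def lp_impulse_response_energy(a: np.ndarray, n: int = 64) -> float:
    """Energy of the truncated impulse response of the LP synthesis filter
    1/A(z), with A(z) = 1 + a[1] z^-1 + ... (truncation length n is an
    assumption)."""
    h = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * h[i - k]
        h[i] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(e1: float, a_last_good: np.ndarray,
                               a_first_good: np.ndarray) -> float:
    """E_q = E_1 * (E_LP0 / E_LP1): scale the excitation energy when the
    LP filter of the first good frame has a higher gain than that of the
    last erased frame."""
    e_lp0 = lp_impulse_response_energy(a_last_good)
    e_lp1 = lp_impulse_response_energy(a_first_good)
    return e1 * (e_lp0 / e_lp1)
```
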
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame (current frames) is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period (pitch period, error frame) ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch value (judging unit) from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
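
A minimal sketch of the artificially constructed periodic excitation part recited in claim 13, assuming a given low-pass filter impulse response and an integer average pitch value; frame and filter lengths are illustrative.

```python
import numpy as np

def build_periodic_excitation(frame_len: int, first_pulse_pos: int,
                              avg_pitch: int, lp_ir: np.ndarray) -> np.ndarray:
    """Construct the periodic excitation for a lost onset frame as a
    low-pass filtered periodic pulse train: the first impulse response is
    centered on the quantized first glottal pulse position, the remaining
    ones are spaced by the average pitch value."""
    exc = np.zeros(frame_len)
    half = len(lp_ir) // 2
    pos = first_pulse_pos
    while pos < frame_len:
        # Center one impulse response of the low-pass filter on each pulse.
        start, stop = pos - half, pos - half + len(lp_ir)
        lo, hi = max(0, start), min(frame_len, stop)
        exc[lo:hi] += lp_ir[lo - start:hi - start]
        pos += avg_pitch  # next pulse one average pitch period later
    return exc
```
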
US5862518A
CLAIM 1
. A speech decoder , comprising : a receiving unit for receiving and outputting parameters of spectral data , pitch data corresponding to a pitch period (decoder concealment, pitch period, decoder determines concealment) , and index data , and gain data of an excitation signal for each frame having a predetermined interval of a speech signal ;
a speech decoder unit for reproducing a speech signal by using said parameters ;
an error correcting unit for correcting an error in said speech signal ;
an error detecting unit for detecting an error frame (decoder concealment, pitch period, decoder determines concealment) incapable of correction in said speech signal ;
a voiced/unvoiced frame judging unit (average pitch value) for judging whether said error frame detected by said error detecting unit is a voiced frame or an unvoiced frame based upon a plurality of feature quantities of a speech signal reproduced in a past frame ;
a bad frame masking unit for voiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as a voiced frame by using said spectral data , said pitch data and said gain data of the past frame , and said index data of said error frame ;
a bad frame masking unit for unvoiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as an unvoiced frame by using said spectral data and said gain data of the past frame and said index data of said error frame ;
and a switching unit for outputting one of the voiced frame and the unvoiced frame according to the judgment result in said voiced/unvoiced frame judging unit .

US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (pitch period, error frame) as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5862518A
CLAIM 1
. A speech decoder , comprising : a receiving unit for receiving and outputting parameters of spectral data , pitch data corresponding to a pitch period (decoder concealment, pitch period, decoder determines concealment) , and index data , and gain data of an excitation signal for each frame having a predetermined interval of a speech signal ;
a speech decoder unit for reproducing a speech signal by using said parameters ;
an error correcting unit for correcting an error in said speech signal ;
an error detecting unit for detecting an error frame (decoder concealment, pitch period, decoder determines concealment) incapable of correction in said speech signal ;
a voiced/unvoiced frame judging unit for judging whether said error frame detected by said error detecting unit is a voiced frame or an unvoiced frame based upon a plurality of feature quantities of a speech signal reproduced in a past frame ;
a bad frame masking unit for voiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as a voiced frame by using said spectral data , said pitch data and said gain data of the past frame , and said index data of said error frame ;
a bad frame masking unit for unvoiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as an unvoiced frame by using said spectral data and said gain data of the past frame and said index data of said error frame ;
and a switching unit for outputting one of the voiced frame and the unvoiced frame according to the judgment result in said voiced/unvoiced frame judging unit .

US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
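
A minimal sketch of the claimed energy information computation, assuming the "maximum of a signal energy" is taken as the largest per-sample energy in the frame; the windowing here is a simplification and may differ from the computation described in the patent.

```python
import numpy as np

def energy_information_parameter(frame: np.ndarray, frame_class: str) -> float:
    """Energy information parameter: maximum of the signal energy for
    frames classified as voiced or onset, average energy per sample for
    other frames (window choice is an assumption)."""
    if frame_class in ("voiced", "onset"):
        return float(np.max(frame ** 2))
    return float(np.mean(frame ** 2))
```
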
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame (current frames) erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
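
A minimal sketch of the claimed energy control in the first non-erased frame, assuming a linear per-sample gain ramp and an arbitrary cap on the gain increase; the measurement windows are illustrative.

```python
import numpy as np

def scale_first_good_frame(synth: np.ndarray, e_end_concealed: float,
                           e_target: float, max_gain: float = 2.0) -> np.ndarray:
    """Scale the synthesized signal of the first non-erased frame so its
    start matches the energy at the end of the last concealed frame and its
    end converges toward the received energy parameter, with the gain
    increase capped (ramp shape and cap value are assumptions)."""
    n = len(synth)
    q = max(1, n // 4)
    e_start = float(np.mean(synth[:q] ** 2)) + 1e-12
    e_end = float(np.mean(synth[-q:] ** 2)) + 1e-12
    g0 = min(np.sqrt(e_end_concealed / e_start), max_gain)  # match previous energy
    g1 = min(np.sqrt(e_target / e_end), max_gain)           # converge to sent energy
    gains = np.linspace(g0, g1, n)                          # sample-by-sample ramp
    return synth * gains
```
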
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (current frames) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (current frames) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voice or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
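
The two transition conditions of claim 19 can be captured as a simple predicate; the class labels and flags used here are illustrative assumptions.

```python
def use_flat_gain(last_good_class: str, first_good_class: str,
                  last_good_was_cng: bool, first_good_is_active: bool) -> bool:
    """Return True when the scaling gain at the start of the first good
    frame should simply equal the gain at its end: a voiced-to-unvoiced
    transition, or comfort noise followed by active speech."""
    voiced_like = {"voiced transition", "voiced", "onset"}
    voiced_to_unvoiced = (last_good_class in voiced_like
                          and first_good_class == "unvoiced")
    cng_to_active = last_good_was_cng and first_good_is_active
    return voiced_to_unvoiced or cng_to_active
```
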
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (current frames) erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (current frames) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (current frames) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period (pitch period, error frame) as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5862518A
CLAIM 1
. A speech decoder , comprising : a receiving unit for receiving and outputting parameters of spectral data , pitch data corresponding to a pitch period (decoder concealment, pitch period, decoder determines concealment) , and index data , and gain data of an excitation signal for each frame having a predetermined interval of a speech signal ;
a speech decoder unit for reproducing a speech signal by using said parameters ;
an error correcting unit for correcting an error in said speech signal ;
an error detecting unit for detecting an error frame (decoder concealment, pitch period, decoder determines concealment) incapable of correction in said speech signal ;
a voiced/unvoiced frame judging unit for judging whether said error frame detected by said error detecting unit is a voiced frame or an unvoiced frame based upon a plurality of feature quantities of a speech signal reproduced in a past frame ;
a bad frame masking unit for voiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as a voiced frame by using said spectral data , said pitch data and said gain data of the past frame , and said index data of said error frame ;
a bad frame masking unit for unvoiced frame for reproducing a speech signal of the error frame detected by said error detecting unit and which is judged as an unvoiced frame by using said spectral data and said gain data of the past frame and said index data of said error frame ;
and a switching unit for outputting one of the voiced frame and the unvoiced frame according to the judgment result in said voiced/unvoiced frame judging unit .

US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (current frames) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (current frames) caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame (current frames) selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame (current frames) erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q = E_1 × (E_LP0 / E_LP1) , where E_1 is an energy at an end of a current frame (current frames) , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5862518A
CLAIM 4
. A speech decoder , comprising : a receiving unit for receiving and outputting input data , the input data including spectral data transmitted for each of a plurality of frames , delay of an adaptive codebook (sound signal, speech signal) having a predetermined excitation signal corresponding to a pitch data , an index of excitation codebook constituting an excitation signal , gains of the adaptive and excitation codebooks and an amplitude of a speech signal ;
an error detection unit for checking whether an error of said each frame occurs based upon said corresponding input data having errors in perceptually important bits ;
a data memory for storing the input data after delaying the data by one frame ;
a speech decoder unit for decoding , when no error is detected by said error detection unit , the speech signal by using the spectral data , delay of the adaptive codebook having the predetermined excitation signal , index of the excitation codebook comprising the excitation signal , gains of the adaptive and excitation codebooks and the amplitude of the speech signal ;
a voiced/unvoiced frame judging unit for deriving a plurality of feature quantities from the speech signal that has been reproduced in said speech decoder unit in a previous frame and for checking whether a current frame is a voiced or unvoiced frame ;
a bad frame masking unit for voiced frame for interpolating , when an error is detected and the current frame is the voiced frame , the speech signal by using the data of the previous and current frames (last frame, current frame, frame erasure, onset frame, replacement frame) ;
and a bad frame masking unit for unvoiced frame for interpolating , when an error is detected and the current frame is the unvoiced frame , the speech signal by using data of the previous and current frames .




US7693710B2

Filed: 2002-05-31     Issued: 2010-04-06

Method and device for efficient frame erasure concealment in linear predictive based speech codecs

(Original Assignee) VoiceAge Corp     (Current Assignee) Voiceage Evs LLC

Milan Jelinek, Philippe Gournay
US5717824A

Filed: 1993-12-07     Issued: 1998-02-10

Adaptive speech coder having code excited linear predictor with multiple codebook searches

(Original Assignee) Pacific Communication Sciences Inc     (Current Assignee) Cirrus Logic Inc ; Mindspeed Technologies LLC ; AudioCodes Inc

Harprit S. Chhatwal
US7693710B2
CLAIM 1
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : conducting frame erasure concealment and decoder recovery comprises , when at least one onset frame is lost , constructing a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the method comprises quantizing a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and constructing the periodic excitation part comprises realizing the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (high p) value from the preceding impulse response up to the end of a last subframe affected by the artificial construction of the periodic part .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US5717824A
CLAIM 21
. The speech coder of claim 19 , further comprising a high pass (average pitch, E q) filter interposed between said first and second codebook search means and said weighting filter means .

US7693710B2
CLAIM 2
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 3
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
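
As a hedged illustration of the pulse-search step recited above, the sketch below (assumed Python/numpy; the names first_glottal_pulse_position, residual and step are hypothetical) takes the sample of maximum amplitude within the first pitch period as the first glottal pulse and quantizes its position with a uniform step.

```python
import numpy as np

def first_glottal_pulse_position(residual, pitch_period, step=4):
    """Hypothetical sketch: locate and quantize the first glottal pulse.

    residual     -- LP residual (or excitation) samples of the current frame
    pitch_period -- pitch period in samples for the first pitch cycle
    step         -- quantization step for the pulse position (assumed value)
    """
    # The sample of maximum amplitude inside the first pitch cycle is taken
    # as the first glottal pulse.
    segment = np.asarray(residual[:pitch_period], dtype=float)
    pulse_pos = int(np.argmax(np.abs(segment)))
    # Quantize the position with a uniform step (illustrative only).
    quantized_pos = int(round(pulse_pos / step)) * step
    return pulse_pos, quantized_pos
```
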
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 4
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and determining concealment/recovery parameters comprises calculating the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and calculating the energy information parameter in relation to an average energy per sample for other frames .
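
A minimal sketch, assuming Python/numpy and the hypothetical name energy_information, of how the energy information parameter recited above could be computed: a maximum of the signal energy for frames classified as voiced or onset, and an average energy per sample for the other classes. The per-sample maximum used here is a simplification of a pitch-synchronous maximum.

```python
import numpy as np

def energy_information(signal, frame_class):
    """Hypothetical sketch of the energy information parameter.

    signal      -- synthesized speech samples of the frame (numpy array)
    frame_class -- one of 'unvoiced', 'unvoiced_transition',
                   'voiced_transition', 'voiced', 'onset'
    """
    signal = np.asarray(signal, dtype=float)
    if frame_class in ('voiced', 'onset'):
        # Maximum of the signal energy; the pitch-synchronous maximum is
        # simplified here to the maximum squared sample.
        return float(np.max(signal ** 2))
    # Average energy per sample for the remaining classes.
    return float(np.mean(signal ** 2))
```
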
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 5
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein conducting frame erasure concealment and decoder recovery comprises : controlling an energy of a synthesized sound signal produced by the decoder , controlling energy of the synthesized sound signal comprising scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and converging the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
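
The energy-control step recited above can be pictured with the following hedged Python sketch (the names rescale_first_good_frame and gain_cap are hypothetical; numpy is assumed): a start gain matches the energy at the end of the concealed frame, an end gain converges toward the received energy information, the increase is limited, and the two gains are interpolated across the frame.

```python
import numpy as np

def rescale_first_good_frame(synth, E_end_concealed, E_received, gain_cap=2.0):
    """Hypothetical sketch of energy control in the first good frame after erasure.

    synth           -- synthesized samples of the first non erased frame
    E_end_concealed -- energy at the end of the last concealed (erased) frame
    E_received      -- energy corresponding to the received energy parameter
    gain_cap        -- assumed bound limiting any increase in energy
    """
    synth = np.asarray(synth, dtype=float)
    n = len(synth)
    E_begin = np.mean(synth[: n // 4] ** 2) + 1e-12   # energy at frame start
    E_end = np.mean(synth[-(n // 4):] ** 2) + 1e-12   # energy at frame end
    # Gain making the start of this frame similar to the end of concealment.
    g0 = min(np.sqrt(E_end_concealed / E_begin), gain_cap)
    # Gain converging toward the transmitted energy at the end of the frame,
    # with the increase limited by gain_cap.
    g1 = min(np.sqrt(E_received / E_end), gain_cap)
    # Linear sample-by-sample interpolation between the two gains.
    gains = np.linspace(g0, g1, n)
    return synth * gains
```
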
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 6
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received after a frame erasure (first speech) is classified as onset , conducting frame erasure concealment and decoder recovery comprises limiting to a given value a gain used for scaling the synthesized sound signal .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 7
. A method as claimed in claim 5 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

determining , in the encoder , concealment/recovery parameters comprises classifying successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and said method comprising making a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (first speech) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
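
A hedged sketch, in plain Python with the hypothetical name use_end_gain_from_start, of the two transition conditions recited above under which the gain at the beginning of the first non erased frame is set equal to the gain used at its end.

```python
def use_end_gain_from_start(prev_class, next_class, prev_coding, next_coding):
    """Hypothetical rule check for the two transitions recited in this claim.

    prev_class, next_class   -- classification of the last good frame before
                                the erasure and of the first good frame after it
    prev_coding, next_coding -- 'comfort_noise' or 'active_speech'
    """
    voiced_like = ('voiced_transition', 'voiced', 'onset')
    # Voiced-to-unvoiced transition across the erasure.
    if prev_class in voiced_like and next_class == 'unvoiced':
        return True
    # Inactive-to-active transition (comfort noise followed by active speech).
    if prev_coding == 'comfort_noise' and next_coding == 'active_speech':
        return True
    return False
```
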
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 8
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter , and a phase information parameter related to the sound signal ;

transmitting to the decoder concealment/recovery parameters determined in the encoder ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to the received concealment/recovery parameters ;

wherein : the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 9
. A method as claimed in claim 8 wherein : adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame comprises using the following relation : E_q (high p) = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure (first speech) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
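
A minimal sketch of the relation recited above, reconstructed here as E_q = E_1 · E_LP0 / E_LP1: the energies E_LP0 and E_LP1 are computed as energies of the impulse response of the LP synthesis filter 1/A(z). Python/numpy is assumed, and the names lp_impulse_response_energy and adjusted_excitation_energy are hypothetical.

```python
import numpy as np

def lp_impulse_response_energy(a, length=64):
    """Energy of the impulse response of the all-pole synthesis filter 1/A(z).

    a -- LP analysis coefficients [1, a1, ..., ap] of A(z)
    """
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        # Difference equation h[n] = delta[n] - sum_k a[k] * h[n-k].
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h[n] = acc
    return float(np.sum(h ** 2))

def adjusted_excitation_energy(E1, lp_prev, lp_curr):
    """Apply E_q = E1 * E_LP0 / E_LP1 as reconstructed from this claim."""
    E_LP0 = lp_impulse_response_energy(lp_prev)  # last good frame before erasure
    E_LP1 = lp_impulse_response_energy(lp_curr)  # first good frame after erasure
    return E1 * E_LP0 / E_LP1
```
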
US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US5717824A
CLAIM 21
. The speech coder of claim 19 , further comprising a high pass filter (average pitch, E q) interposed between said first and second codebook search means and said weighting filter means .

US7693710B2
CLAIM 10
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein the concealment/recovery parameters include the phase information parameter and wherein determination of the phase information parameter comprises : determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and encoding , in the encoder , a shape , sign and amplitude of the first glottal pulse and transmitting the encoded shape , sign and amplitude from the encoder to the decoder .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 11
. A method of concealing frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : determining , in the encoder , concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

determination of the phase information parameter comprises determining a position of a first glottal pulse in a frame of the encoded sound signal ;

and determining the position of the first glottal pulse comprises : measuring a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and quantizing a position of the sample of maximum amplitude within the pitch period .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 12
. A method for the concealment of frame erasure (first speech) caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , comprising : determining , in the decoder , concealment/recovery parameters from the signal-encoding parameters , wherein the concealment/recovery parameters are selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal and are used for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and in the decoder , conducting frame erasure concealment and decoder recovery in response to concealment/recovery parameters determined in the decoder ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and conducting frame erasure concealment and decoder recovery comprises , when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusting an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (high p) = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of the current frame , E_LP0 is an energy of an impulse response of the LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US5717824A
CLAIM 21
. The speech coder of claim 19 , further comprising a high pass filter (average pitch, E q) interposed between said first and second codebook search means and said weighting filter means .

US7693710B2
CLAIM 13
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder : wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

for conducting frame erasure concealment and decoder recovery , the decoder constructs , when at least one onset frame is lost , a periodic excitation part artificially as a low-pass filtered periodic train of pulses separated by a pitch period ;

the device comprises a quantizer of a position of a first glottal pulse with respect to the beginning of the onset frame prior to transmission of said position of the first glottal pulse to the decoder ;

and the decoder , for constructing the periodic excitation part , realizes the low-pass filtered periodic train of pulses by : centering a first impulse response of a low-pass filter on the quantized position of the first glottal pulse with respect to the beginning of the onset frame ;

and placing remaining impulse responses of the low-pass filter each with a distance corresponding to an average pitch (high p) value from the preceding impulse response up to an end of a last subframe affected by the artificial construction of the periodic part .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US5717824A
CLAIM 21
. The speech coder of claim 19 , further comprising a high pass filter (average pitch, E q) interposed between said first and second codebook search means and said weighting filter means .

US7693710B2
CLAIM 14
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 15
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse , and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 16
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 17
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to concealment/recovery parameters received from the encoder ;

and for conducting frame erasure concealment and decoder recovery : the decoder controls an energy of a synthesized sound signal produced by the decoder by scaling the synthesized sound signal to render an energy of said synthesized sound signal at the beginning of a first non erased frame received following frame erasure similar to an energy of said synthesized sound signal at the end of a last frame erased during said frame erasure ;

and the decoder converges the energy of the synthesized sound signal in the received first non erased frame to an energy corresponding to the received energy information parameter toward the end of said received first non erased frame while limiting an increase in energy .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 18
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and when the first non erased frame received following frame erasure (first speech) is classified as onset , the decoder , for conducting frame erasure concealment and decoder recovery , limits to a given value a gain used for scaling the synthesized sound signal .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 19
. A device as claimed in claim 17 , wherein : the sound signal (adaptive codebook) is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the decoder makes a gain used for scaling the synthesized sound signal at the beginning of the first non erased frame received after frame erasure (first speech) equal to a gain used at an end of said received first non erased frame : during a transition from a voiced frame to an unvoiced frame , in the case of a last non erased frame received before frame erasure classified as voiced transition , voiced or onset and a first non erased frame received after frame erasure classified as unvoiced ;

and during a transition from a non-active speech period to an active speech period , when the last non erased frame received before frame erasure is encoded as comfort noise and the first non erased frame received after frame erasure is encoded as active speech .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 20
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the decoder conducts frame erasure concealment and decoder recovery in response to the concealment/recovery parameters received from the encoder ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , the decoder adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 21
. A device as claimed in claim 20 , wherein : the decoder , for adjusting the energy of the LP filter excitation signal produced in the decoder during the received first non erased frame to the gain of the LP filter of said received first non erased frame , uses the following relation : E_q (high p) = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure (first speech) , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US5717824A
CLAIM 21
. The speech coder of claim 19 , further comprising a high pass filter (average pitch, E q) interposed between said first and second codebook search means and said weighting filter means .

US7693710B2
CLAIM 22
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher encodes a shape , sign and amplitude of the first glottal pulse and the communication link transmits the encoded shape , sign and amplitude from the encoder to the decoder .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 23
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the concealment/recovery parameters include the phase information parameter ;

to determine the phase information parameter , the determiner comprises a searcher of a position of a first glottal pulse in a frame of the encoded sound signal ;

and the searcher measures a sample of maximum amplitude within a pitch period as the first glottal pulse ;

and the determiner comprises a quantizer of the position of the sample of maximum amplitude within the pitch period .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 24
. A device for conducting concealment of frame erasure (first speech) caused by frames of an encoded sound signal (adaptive codebook) erased during transmission from an encoder to a decoder , comprising : in the encoder , a determiner of concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal ;

and a communication link for transmitting to the decoder concealment/recovery parameters determined in the encoder ;

wherein : the sound signal is a speech signal (adaptive codebook) ;

the determiner of concealment/recovery parameters comprises a classifier of successive frames of the encoded sound signal as unvoiced , unvoiced transition , voiced transition , voiced , or onset ;

and the determiner of concealment/recovery parameters comprises a computer of the energy information parameter in relation to a maximum of a signal energy for frames classified as voiced or onset , and in relation to an average energy per sample for other frames .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US7693710B2
CLAIM 25
. A device for the concealment of frame erasure (first speech) caused by frames erased during transmission of a sound signal (adaptive codebook) encoded under the form of signal-encoding parameters from an encoder to a decoder , wherein : the decoder determines concealment/recovery parameters selected from the group consisting of a signal classification parameter , an energy information parameter and a phase information parameter related to the sound signal , for producing , upon occurrence of frame erasure , a replacement frame selected from the group consisting of a voiced frame , an unvoiced frame , and a frame defining a transition between voiced and unvoiced frames ;

and the decoder conducts erased frame concealment (error value) and decoder recovery in response to determined concealment/recovery parameters ;

wherein : the concealment/recovery parameters include the energy information parameter ;

the energy information parameter is not transmitted from the encoder to the decoder ;

and the decoder , for conducting frame erasure concealment and decoder recovery when a gain of a LP filter of a first non erased frame received following frame erasure is higher than a gain of a LP filter of a last frame erased during said frame erasure , adjusts an energy of an LP filter excitation signal produced in the decoder during the received first non erased frame to a gain of the LP filter of said received first non erased frame using the following relation : E_q (high p) = E_1 · E_LP0 / E_LP1 where E_1 is an energy at an end of a current frame , E_LP0 is an energy of an impulse response of a LP filter of a last non erased frame received before the frame erasure , and E_LP1 is an energy of an impulse response of the LP filter of the received first non erased frame following frame erasure .
US5717824A
CLAIM 6
. The apparatus of claim 5 , wherein said third codebook means comprises adaptive codebook (sound signal, speech signal) means determines long term predictor information in relation to said target vector .

US5717824A
CLAIM 8
. The apparatus of claim 7 , further comprising scaling means for scaling the error associated with said first speech (frame erasure, concealing frame erasure) signal prior to comparison by said comparator .

US5717824A
CLAIM 14
. The speech coder of claim 13 , wherein said accumulator accumulates the first and second error value (frame concealment) s associated with two subframes .

US5717824A
CLAIM 21
. The speech coder of claim 19 , further comprising a high pass filter (average pitch, E q) interposed between said first and second codebook search means and said weighting filter means .